Treatment Effect Heterogeneity in Theory and Practice

Instrumental  Variables  (IV)  methods  identify  internally  valid  causal  effects  for  individuals  whose  treatment 
status is manipulable by the instrument at hand. Inference for other populations requires homogeneity
assumptions.  This  paper  outlines  a  theoretical  framework  that  nests  causal  homogeneity  assumptions.  These 
ideas  are  illustrated  using  sibling-sex  composition  to  estimate  the  effect  of  child-bearing  on  economic  and 
marital  outcomes.  The  application  is  motivated  by  American  welfare  reform.  The  empirical  results  generally 
support  the  notion  of  reduced  labor  supply  and  increased  poverty  as  a  consequence  of  childbearing,  but 
evidence  on  the  impact  of  childbearing  on  marital  stability  and  welfare  use  is  more  tenuous. 
Empirical research often focuses on causal inference for the purpose of prediction, yet it seems fair
to  say  that  most  prediction  involves  a  fair  amount  of  guesswork.  The  relevance  or  "external  validity"  of  a 
particular set of empirical results is always an open question. As Karl Pearson (1911, p. 157) observed in
an  early  discussion  of  the  use  of  correlation  for  prediction,  "Everything  in  the  universe  occurs  but  once,  there 
is  no  absolute  sameness  of  repetition."  This  practical  difficulty  notwithstanding,  empirical  research  is  almost 
always  motivated  by  a  belief  that  estimates  for  a  particular  context  provide  useful  information  about  the 
likely  effects  of  similar  programs  or  events  in  the  future.  Our  investment  of  time  and  energy  in  often- 
discouraging  empirical  work  reveals  that  empiricists  like  me  are  willing  to  extrapolate. 

The  basis  for  extrapolation  is  a  set  of  assumptions  about  the  cross-sectional  homogeneity  or  temporal 
stability  of  causal  effects.  As  a  graduate  student,  I  learned  about  parameter  stability  as  "the  Lucas  critique," 
while  my  own  teaching  and  research  focuses  on  the  identification  possibilities  for  average  causal  effects  in 
models  with  heterogeneous  potential  outcomes.  Applied  micro-econometricians  devote  considerable 
attention  to  the  question  of  whether  homogeneity  and  stability  assumptions  can  be  justified,  and  to  the 
implications  of  heterogeneity  for  alternative  parameter  estimates.  Regrettably,  this  sort  of  analysis 
sometimes  comes  at  the  expense  of  a  rigorous  examination  of  the  internal  validity  of  estimates,  i.e.,  whether 
the  estimates  have  a  causal  interpretation  for  the  population  under  study.  Clearly,  however,  even  internally 
valid  estimates  are  less  interesting  if  they  are  completely  local,  i.e.,  have  no  predictive  value  for  populations 
other  than  the  directly  affected  group. 

In this paper, I discuss the nature and consequences of homogeneity assumptions that facilitate the
use of instrumental variables (IV) estimates for extrapolation.^1 To be precise, I am interested in the
assumptions that link a Local Average Treatment Effect (LATE) tied to a particular instrument with the
population Average Treatment Effect (ATE), which is not instrument-dependent. Implicitly, I have in mind
prediction for populations defined by covariates. I focus on ATE because it answers the question: "If we
were to treat individuals with characteristics X, what would the likely change in outcomes be?" This allows
me to sidestep variability due to changes in the process determining treatment status. For example, causal
research often focuses on average treatment effects on the treated population. Overall average treatment
effects are theoretically more stable than average effects on the treated, since the latter depend on who gets
treated as well as on the distribution of potential outcomes.

^1 Denis Sargan noted the difficulty of the core instrumental variables identification problem, i.e.,
identification in models with constant effects, in his seminal 1958 paper: "It is not easy to justify the basic
assumptions concerning these errors, namely that they are independent of the instrumental variables." (p. 396;
quoted in Arellano, 2002).

The external validity of IV estimates is of special interest both because of the growing importance
of IV methods in empirical work (see, e.g., Moffitt, 1999), and because the ex ante generality of IV estimates
is  limited  in  a  precise  way  by  a  number  of  well-known  theoretical  results.  Except  in  special  cases  like 
constant  treatment  effects  and  certain  types  of  randomized  trials,  the  standard  IV  assumptions  of  exclusion 
and  independence  -  analogous  to  the  notion  that  the  instrument  induces  a  good  experiment  for  the  effect  of 
interest  -  are  not  sufficient  to  capture  the  expected  causal  effect  on  a  randomly  selected  individual  or  even 
in  the  population  subject  to  treatment.  Rather,  basic  IV  assumptions  identify  causal  effects  on  "compliers," 
defined  as  the  subpopulation  of  treated  individuals  whose  treatment  status  can  be  influenced  by  the 
instrument.  Although  this  limitation  is  unsurprising,  the  nature  and  plausibility  of  assumptions  under  which 
IV  estimates  have  broader  predictive  power  are  worth  exploring. 

The next two sections develop a theoretical framework linking alternative causal parameters to
population subgroups defined by their response to an instrument. That is, I consider formal links between
parameters like LATE and ATE. My agenda is to make this link using a range of assumptions, progressing
from stronger (no selection bias) to weaker (a proportionality assumption). These theoretical ideas are then
applied to the same sex instrument, used by Angrist and Evans (1998) to estimate the effects of childbearing
on labor supply. This instrument arises from the fact that some parents prefer a mixed sibling sex
composition. In particular, among parents who have at least two children, those with two boys or two girls
are much more likely to go on to have a third child. Because child sex is virtually randomly assigned, a
dummy for same sex sibling pairs provides an instrumental variable that can be used to identify the effect
of childbearing on a range of economic and family outcomes.

My earlier work with Evans using the same sex instrument focused on the effects of childbearing on
labor  supply.  While  labor  supply  outcomes  also  appear  in  this  paper,  the  empirical  work  features  an 
investigation  of  the  effects  of  childbearing  on  marital  stability.  An  inquiry  into  the  effects  of  family  size  on 
marital  stability  can  be  motivated  by  American  welfare  reform,  which  penalizes  further  childbearing  by 
women  receiving  public  assistance  on  the  grounds  that  increases  in  family  size  make  continued  poverty  and 
welfare  receipt  more  likely.  I  therefore  look  at  effects  on  marital  status,  poverty  status,  and  welfare  use,  as 
well  as  labor  supply.  Estimates  of  ATE  for  the  effects  of  childbearing  are  generally  smaller  than  estimates 
of  LATE.  For  example,  while  estimates  of  LATE  for  the  effect  of  childbearing  on  welfare  use  and  marital 
stability  are  mostly  significantly  different  from  zero,  most  (though  not  all)  of  the  estimates  of  ATE  for  effects 
of  childbearing  on  marital  status  and  welfare  use  are  small  and  insignificant.  One  tantalizing  result  is  that 
for  teen  mothers,  LATE  appears  to  be  virtually  identical  to  the  population  average  treatment  effect  when  the 
latter  is  imputed  under  the  assumptions  considered  below. 

The  empirical  results  suggest  a  pattern  of  modest  effects,  but  the  variability  in  parameter  estimates 
across  model  specifications  and  samples,  as  well  as  the  usual  problem  of  more  imprecise  estimates  under 
weaker identifying assumptions, reduces the predictive value of any findings. On balance it seems fair to say
that  the  attempt  to  go  from  LATE  to  ATE  weakens  the  evidence  for  an  adverse  effect  of  childbearing  on 
marital  stability  and  welfare  use,  but  the  estimates  of  ATE  do  not  provide  a  sharp  alternative  to  LATE.  This 
is  perhaps  not  surprising,  given  the  difficulty  of  the  underlying  identification  problem.  As  in  the 
experimental  sciences,  the  best  evidence  for  predictive  value  is  likely  to  come  from  new  data  sets  and  new 
experiments,  which  in  the  case  of  applied  econometrics  usually  means  new  instruments. 


1.  Causality  and  Potential  Outcomes  in  Research  on  Childbearing 

The effects of children on marital stability have long been of interest to social scientists and are of
course of more than academic interest to many married couples. Previous research (e.g., Becker, Landes, and
Michael, 1977; Cherlin, 1977; Heaton, 1990; and Waite and Lillard, 1991) suggests the presence of young
children increases marital stability, although many authors acknowledge serious selection problems that may
bias results. A related issue is the connection between childbearing and women's standard of living. A large
literature looks at the effect of teen childbearing on mothers' schooling, earnings, and welfare status,
sometimes using instrumental variables (e.g., Bronars and Grogger, 1994). Interest in this question can be
motivated by welfare reform, which includes "family caps" in many U.S. states. Family caps reduce or
eliminate benefits paid for children born to welfare recipients, on the theory that further childbearing by
welfare mothers increases the likelihood they will stay poor and therefore continue to receive benefits.^2

Are children the glue that holds couples together or a burden that accelerates a fragile family's
collapse? Does childbearing further impoverish poor women? Implicit in these questions is the notion of
potential outcomes, i.e., a contrast in circumstances with and without childbearing, for a given family. To
represent this idea formally, let $D_i$ be an indicator for women with more than two children in a sample of
women with at least two children. Because $D_i$ is binary, I will refer to it as a "treatment," even though
family size is not determined directly by a program or policy. Let $Y_{1i}$ be a woman's circumstances if
$D_i = 1$, and let $Y_{0i}$ be her circumstances otherwise. We imagine both of these potential outcomes are
well-defined for everyone, though only one is ever observed for each woman. Formally, this can be expressed
by writing the observed outcome, $Y_i$, as

$$Y_i = Y_{0i}(1 - D_i) + Y_{1i} D_i.$$

^2 See Maynard, et al. (1998) for more on the motivation for family caps. The possibility of a link between
childbearing and poverty notwithstanding, there is little evidence that family caps actually affect fertility
behavior. See, e.g., Grogger and Bronars (2001), Blank (2002) or Kearney (2002).

For both practical and substantive reasons, I focus here on fertility consequences defined with
reference to the transition from two to more than two children. On the practical side, instruments based on
sibling sex composition are available for this fertility increment. Angrist and Evans (1998) used parents'
preferences for a mixed sibling sex composition to estimate the labor supply consequences of childbearing.
On the substantive side, post-war reductions in marital fertility have been concentrated in the 2-3 child range
(see, e.g., Westoff, Potter, and Sagi, 1963). While almost all couples want at least one child, the decision
to have a third child may be due in part to a sense of whether this is good for long-term marital stability or,
more generally, the economic welfare of the family. Finally, interest in the 2-to-3 child increment is
supported by the fact that the population of welfare mothers in 1990 had an average of about 2.3 children
and a median of 2 children.

Since $Y_{1i}$ and $Y_{0i}$ are never both observed for the same woman, research on causal effects tries
to capture the average difference in potential outcomes for different subpopulations. For example, we may
be interested in $E[Y_{1i} - Y_{0i} \mid D_i = 1]$, which is the effect on women who have a third child. Note
that $E[Y_{1i} \mid D_i = 1]$ is an observed quantity, so estimating $E[Y_{1i} - Y_{0i} \mid D_i = 1]$ is
equivalent to estimating the counterfactual average, $E[Y_{0i} \mid D_i = 1]$. Alternately, we may be
interested in the unconditional average treatment effect (ATE), $E[Y_{1i} - Y_{0i}]$, which can be used to make
predictive statements about the impact of childbearing on a randomly chosen woman (or a woman with a
particular set of characteristics if the analysis conditions on covariates). Estimation of ATE is equivalent
to estimation of both counterfactual averages, $E[Y_{0i} \mid D_i = 1]$ and $E[Y_{1i} \mid D_i = 0]$.

Causal parameters are easy to describe but hard to measure. The observed difference in outcomes
between those with $D_i = 1$ and $D_i = 0$ equals $E[Y_{1i} - Y_{0i} \mid D_i = 1]$ plus a bias term:

$$E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0] = E[Y_{1i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0] \tag{1}$$

$$= E[Y_{1i} - Y_{0i} \mid D_i = 1] + \{E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]\}.$$

The bias term disappears when childbearing is determined in a manner independent of a woman's potential
outcomes. But this independence assumption seems unrealistic since childbearing decisions are made in light
of information about family circumstances and earnings potential.
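The decomposition in equation (1) can be checked numerically. The following is a minimal sketch under an assumed data-generating process: the selection rule, the average effect of -0.5, and all variable names are illustrative choices, not taken from the paper. Treatment is made more likely when the no-treatment outcome is low, so the bias term is nonzero.

```python
import random

random.seed(0)

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Assumed DGP (illustrative only): treatment is more likely when the
# no-treatment outcome Y0 is low, so D is not independent of Y0.
rows = []
for _ in range(200_000):
    y0 = random.gauss(0.0, 1.0)                   # outcome without treatment
    y1 = y0 - 0.5 + random.gauss(0.0, 1.0)        # outcome with treatment
    d = 1 if random.gauss(0.0, 1.0) > y0 else 0   # selection depends on Y0
    rows.append((y0, y1, d))

treated = [(y0, y1) for y0, y1, d in rows if d == 1]
untreated = [(y0, y1) for y0, y1, d in rows if d == 0]

# Left-hand side of (1): observed difference in mean outcomes.
naive = mean(y1 for _, y1 in treated) - mean(y0 for y0, _ in untreated)
# E[Y1 - Y0 | D = 1]; computable here only because the simulation keeps Y0.
effect_on_treated = mean(y1 - y0 for y0, y1 in treated)
# Bias term: E[Y0 | D = 1] - E[Y0 | D = 0].
selection_bias = mean(y0 for y0, _ in treated) - mean(y0 for y0, _ in untreated)

print(f"observed difference : {naive:.3f}")
print(f"effect on treated   : {effect_on_treated:.3f}")
print(f"selection bias      : {selection_bias:.3f}")
```

In this sketch the observed difference equals the sum of the other two quantities up to floating-point rounding, and the bias term is large relative to the causal effect even though the effect itself does not vary systematically with treatment status.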


Two sorts of strategies are typically used to estimate causal effects in the presence of possible
omitted variables bias. One assumes that conditional on covariates, $X_i$, the regressor of interest, $D_i$, is
independent of potential outcomes. Then any causal effect of interest can be estimated from weighted
conditional-on-X comparisons. This is a strong assumption that seems most plausible when researchers have
considerable prior information about the process determining $D_i$. Alternately, we might try to find an
instrumental variable which, perhaps after conditioning on covariates, is related to $D_i$ but independent of
potential outcomes. The instrument used here is a dummy variable indicating same-sex sibling pairs.

2.  IV  in  context 

IV estimates capture the effect of treatment on the treated for those whose treatment status can be
changed by the instrument at hand. This idea is easiest to formalize using a notation for potential treatment
assignments that parallels the notation for potential outcomes. In particular, let $D_{0i}$ and $D_{1i}$ denote
potential treatment assignments indexed relative to a binary instrument. Suppose, for example, $D_i$ is
determined by a latent-index assignment mechanism,

$$D_i = 1(\gamma_0 + \gamma_1 Z_i > \eta_i), \tag{2}$$

where $Z_i$ is a binary instrument, and $\eta_i$ is a random error independent of the instrument. Then the
potential treatment assignments are $D_{0i} = 1[\gamma_0 > \eta_i]$ and $D_{1i} = 1[\gamma_0 + \gamma_1 > \eta_i]$,
both of which are independent of $Z_i$.

The constant-effects latent-index assignment model is restrictive since it implies $D_{1i} \geq D_{0i}$ for
all i, or vice versa. We can relax this restriction by allowing a random $\gamma_{1i}$ for each i, in which case
the latent index model is just an alternative notation for $D_{0i}$ and $D_{1i}$. Whether linked to an index
model or not, $D_{0i}$ tells us what treatment i would receive if $Z_i = 0$, and $D_{1i}$ tells us what treatment
i would receive if $Z_i = 1$. The observed assignment variable, $D_i$, can therefore be written:

$$D_i = D_{0i}(1 - Z_i) + D_{1i} Z_i.$$

This notation makes it clear that, paralleling potential outcomes, only one potential assignment is ever
observed for a particular individual.

The key assumptions supporting IV estimation are given below (for a model without covariates):

INDEPENDENCE. $\{Y_{0i}, Y_{1i}, D_{0i}, D_{1i}\} \perp Z_i$.

FIRST STAGE. $P[D_i = 1 \mid Z_i = 1] \neq P[D_i = 1 \mid Z_i = 0]$.

MONOTONICITY. Either $D_{1i} \geq D_{0i}\ \forall i$ or vice versa; without loss of generality, assume the former.

These assumptions capture the notion that the instrument is "as good as randomly assigned" (independence),
affects the probability of treatment (first-stage), and affects everyone the same way if at all (monotonicity).
Imbens and Angrist (1994) show that together they imply:

$$\frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0]} = E[Y_{1i} - Y_{0i} \mid D_{1i} > D_{0i}].$$

The left-hand side of this expression is the population analog of Wald's (1940) estimator for regression
models with measurement error. The Local Average Treatment Effect (LATE) on the right hand side,
$E[Y_{1i} - Y_{0i} \mid D_{1i} > D_{0i}]$, is the effect of treatment on those whose treatment status is changed by
the instrument, i.e., the population for which $D_{1i} = 1$ and $D_{0i} = 0$.^3
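To make the LATE result concrete, here is a minimal simulation sketch. It assumes a latent-index first stage of the form given in equation (2) with illustrative coefficients (0 and 1) and an assumed gain that falls with the first-stage error; none of these numbers come from the paper. The Wald ratio is computed from observables only, while the complier average uses the latent potential assignments that a real data set would not contain.

```python
import random

random.seed(1)

GAMMA0, GAMMA1 = 0.0, 1.0   # assumed first-stage coefficients

def mean(values):
    values = list(values)
    return sum(values) / len(values)

rows = []
for _ in range(200_000):
    z = random.randint(0, 1)                   # binary instrument
    eta = random.gauss(0.0, 1.0)               # latent first-stage error
    d0 = 1 if GAMMA0 > eta else 0              # treatment status if Z = 0
    d1 = 1 if GAMMA0 + GAMMA1 > eta else 0     # treatment status if Z = 1
    y0 = random.gauss(0.0, 1.0)
    gain = 1.0 - 0.8 * eta                     # assumed heterogeneous effect
    d = d1 if z == 1 else d0
    y = y0 + gain * d
    rows.append((y, d, z, d0, d1, gain))

def cond_mean(column, z_value):
    """Sample analog of E[column | Z = z_value]."""
    return mean(r[column] for r in rows if r[2] == z_value)

# Wald ratio: uses only the observables (Y, D, Z).
wald = (cond_mean(0, 1) - cond_mean(0, 0)) / (cond_mean(1, 1) - cond_mean(1, 0))
# Infeasible benchmarks that use the latent potential assignments.
late = mean(r[5] for r in rows if r[4] > r[3])   # compliers: D1 > D0
ate = mean(r[5] for r in rows)

print(f"Wald estimate : {wald:.3f}")
print(f"LATE (latent) : {late:.3f}")
print(f"ATE  (latent) : {ate:.3f}")
```

Under these assumptions the Wald ratio tracks the complier average rather than the population average, which is the sense in which IV estimates are local.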

^3 Proof of the LATE result: $E[Y_i \mid Z_i = 1] = E[Y_{0i} + (Y_{1i} - Y_{0i})D_i \mid Z_i = 1]$, which equals
$E[Y_{0i} + (Y_{1i} - Y_{0i})D_{1i}]$ by independence. Likewise
$E[Y_i \mid Z_i = 0] = E[Y_{0i} + (Y_{1i} - Y_{0i})D_{0i}]$, so the Wald numerator is
$E[(Y_{1i} - Y_{0i})(D_{1i} - D_{0i})]$. Monotonicity means $D_{1i} - D_{0i}$ equals one or zero, so
$E[(Y_{1i} - Y_{0i})(D_{1i} - D_{0i})] = E[Y_{1i} - Y_{0i} \mid D_{1i} > D_{0i}]\,P[D_{1i} > D_{0i}]$. A similar
argument shows $E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0] = E[D_{1i} - D_{0i}] = P[D_{1i} > D_{0i}]$.

A standard assumption invoked in most empirical studies is constant causal effects, i.e.,

$$Y_{1i} = Y_{0i} + \alpha,$$

for some constant $\alpha$. In the childbearing application, a constant effects assumption implies that IV
consistently estimates the common effects of childbearing on all women, since, given constant effects,
$E[Y_{1i} - Y_{0i} \mid D_{1i} > D_{0i}] = \alpha$. The LATE result above highlights the fact that in a more realistic
world where this effect varies (and indeed it must vary if, for example, $Y_i$ is a binary outcome or other
variable with limited support), then we can be sure only that IV captures the effect on individuals whose
treatment status can be changed by manipulating $Z_i$. These are people with $D_{1i} = 1$ and $D_{0i} = 0$, or
$D_{1i} - D_{0i} = 1$. Note also that since $D_{1i}$ and $D_{0i}$ are defined with reference to a particular
instrument, then - again, in the absence of additional assumptions - we should expect different instruments
to uncover different average causal effects. We might, for example, expect an IV strategy based on the same
sex instrument to identify a different average effect than an instrument based on twin births. In fact,
Angrist and Evans (1998) report IV estimates using twin birth instruments that are much lower than those
using same sex instruments.

Angrist, Imbens, and Rubin (1996) refer to people with $D_{1i} - D_{0i} = 1$ as the population of compliers.
This terminology is motivated by an analogy to randomized trials where $Z_i$ is a randomized offer of treatment
and $D_i$ is actual treatment status. Since $D_{1i} - D_{0i} = 1$ implies $D_i = Z_i$, compliers are those who comply
with an experimenter's intended treatment status (though not all those with $D_i = Z_i$ are compliers, as
explained below). For compliers, the averages of $Y_{1i}$ and $Y_{0i}$ as well as the average difference are also
identified. In particular, Abadie (2002) shows that

$$\frac{E[Y_i D_i \mid Z_i = 1] - E[Y_i D_i \mid Z_i = 0]}{E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0]} = E[Y_{1i} \mid D_{1i} > D_{0i}] \tag{3a}$$

$$\frac{E[Y_i(1 - D_i) \mid Z_i = 1] - E[Y_i(1 - D_i) \mid Z_i = 0]}{E[(1 - D_i) \mid Z_i = 1] - E[(1 - D_i) \mid Z_i = 0]} = E[Y_{0i} \mid D_{1i} > D_{0i}]. \tag{3b}$$

The entire (marginal) distributions of $Y_{1i}$ and $Y_{0i}$ are similarly identified, a fact used by Abadie,
Angrist, and Imbens (2002) to estimate the causal effect of treatment on the quantiles of potential outcomes
for compliers.
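The two ratios in (3a) and (3b) are simple to compute from data on outcomes, treatment, and the instrument. The sketch below uses a hypothetical helper, `complier_means`, and a tiny made-up population in which the type of every unit is known by construction, so the answer can be verified by hand. The outcome values (5, 1, 3, 2) are arbitrary illustrations.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)

def complier_means(rows):
    """rows: iterable of (y, d, z) with binary d and z.

    Returns (E[Y1 | complier], E[Y0 | complier]) using the ratios
    in equations (3a) and (3b).
    """
    rows = list(rows)
    z1 = [(y, d) for y, d, z in rows if z == 1]
    z0 = [(y, d) for y, d, z in rows if z == 0]
    first_stage = mean(d for _, d in z1) - mean(d for _, d in z0)
    y1_c = (mean(y * d for y, d in z1) - mean(y * d for y, d in z0)) / first_stage
    # The denominator of (3b), E[1-D | Z=1] - E[1-D | Z=0], is -first_stage.
    y0_c = (mean(y * (1 - d) for y, d in z1)
            - mean(y * (1 - d) for y, d in z0)) / (-first_stage)
    return y1_c, y0_c

# Hypothetical population: each type appears equally often at Z = 0 and
# Z = 1, mimicking an instrument that is as good as randomly assigned.
rows = []
for z in (0, 1):
    rows += [(5.0, 1, z)] * 2                  # always-takers, Y1 = 5
    rows += [(1.0, 0, z)] * 2                  # never-takers,  Y0 = 1
    rows += [(3.0 if z else 2.0, z, z)] * 2    # compliers, Y1 = 3 and Y0 = 2

y1_c, y0_c = complier_means(rows)
print(y1_c, y0_c, y1_c - y0_c)
```

Up to floating-point rounding, the function returns the complier means of 3 and 2 built into the example, and their difference is the complier average effect; the always-taker and never-taker outcomes drop out of both ratios.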
An important econometric result in the theory of causal effects is that when treatment is assigned
by a mechanism like (2), population average treatment effects and the effect on the treated are not identified
without assumptions such as constant effects or some other assumption beyond the three given above. This
result or theorem appears in various forms; see, for example, Chamberlain (1986), Heckman (1990), and
Angrist and Imbens (1991). The next section develops a framework that highlights the limits to identification
and the role played by alternative homogeneity assumptions in efforts to go beyond LATE. The same sex
instrument offers an especially challenging proving ground for these ideas since at most 7% of American
women have an additional child as a result of sex preferences. Causal effects on same sex compliers can
therefore be quite far from overall average effects if the impact of childbearing on these women is not typical.
Before turning to a general discussion of treatment effect heterogeneity, I briefly explore the relationship
between LATE, ATE, and effects on the treated in a parametric model that mimics the same sex setup.

2.1  A  Parametric  Example 

Following Heckman, Tobias, and Vytlacil (2001), I calculated average causal effects using a
trivariate Normal model for the joint distribution of potential outcomes and the error term in the latent-index
assignment mechanism given by equation (2). Assuming the distribution of $[Y_{1i}, Y_{0i}, \eta_i]'$ is joint
standard Normal, ATE is zero by construction. Assume also that $\gamma_1 > 0$ so monotonicity is satisfied with
$D_{1i} \geq D_{0i}$ and let $\rho_{10}$ be the correlation between $Y_{1i} - Y_{0i}$ and $\eta_i$. In this parametric
model, LATE can be written:

$$E[Y_{1i} - Y_{0i} \mid D_{1i} > D_{0i}] = E[Y_{1i} - Y_{0i} \mid \gamma_0 + \gamma_1 > \eta_i > \gamma_0] \tag{4}$$

$$= \rho_{10}\{[\phi(\gamma_0) - \phi(\gamma_0 + \gamma_1)][\Phi(\gamma_0 + \gamma_1) - \Phi(\gamma_0)]^{-1}\},$$

where $\phi(\cdot)$ and $\Phi(\cdot)$ are the Normal density and distribution functions. Similarly, we can use
Normality to write the effect on the treated as:

$$E[Y_{1i} - Y_{0i} \mid D_i = 1] = E\{E[Y_{1i} - Y_{0i} \mid \gamma_0 + \gamma_1 Z_i > \eta_i, Z_i] \mid D_i = 1\} \tag{5}$$

$$= -\rho_{10}\{\lambda(\gamma_0 + \gamma_1)E[Z_i \mid D_i = 1] + \lambda(\gamma_0)(1 - E[Z_i \mid D_i = 1])\},$$

where $\lambda(\cdot)$ is the inverse Mills ratio, $\phi(\cdot)/\Phi(\cdot)$. This formula is useful for calculation,
but the following expression better clarifies the difference between LATE and the effect on the treated:

$$E[Y_{1i} - Y_{0i} \mid D_i = 1] = E[Y_{1i} - Y_{0i} \mid \gamma_0 + \gamma_1 > \eta_i > \gamma_0]\,\omega + E[Y_{1i} - Y_{0i} \mid \gamma_0 > \eta_i](1 - \omega), \tag{6}$$

where $\omega = \{\Phi(\gamma_0 + \gamma_1) - \Phi(\gamma_0)\}[P(Z_i = 1)/P(D_i = 1)]$ and
$1 - \omega = \Phi(\gamma_0)/P(D_i = 1)$. Equation (6) shows the effect on the treated to be a weighted average
of LATE and the average effect on those with $\gamma_0 > \eta_i$, with weights that depend on the first stage and
the distribution of $Z_i$.
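Equations (4) through (6) are straightforward to evaluate. The sketch below codes them with Python's standard-library `statistics.NormalDist`; the function names are hypothetical, and the inputs (a baseline treatment rate of .40, a first stage of .07, a correlation of -.1, and an instrument that equals one with probability .5) are illustrative choices rather than values reported in the text. It also confirms numerically that the two expressions for the effect on the treated agree.

```python
from statistics import NormalDist

N = NormalDist()  # standard Normal: N.pdf is the density, N.cdf the distribution

def late(g0, g1, rho):
    """Equation (4): average gain for compliers, g0 < eta < g0 + g1."""
    return rho * (N.pdf(g0) - N.pdf(g0 + g1)) / (N.cdf(g0 + g1) - N.cdf(g0))

def inverse_mills(c):
    return N.pdf(c) / N.cdf(c)

def prob_treated(g0, g1, p_z):
    """P(D = 1) when the instrument equals one with probability p_z."""
    return (1 - p_z) * N.cdf(g0) + p_z * N.cdf(g0 + g1)

def treated_eq5(g0, g1, rho, p_z):
    """Equation (5): mixture of inverse Mills ratios."""
    e_z = p_z * N.cdf(g0 + g1) / prob_treated(g0, g1, p_z)   # E[Z | D = 1]
    return -rho * (inverse_mills(g0 + g1) * e_z + inverse_mills(g0) * (1 - e_z))

def treated_eq6(g0, g1, rho, p_z):
    """Equation (6): weighted average of LATE and the effect for eta < g0."""
    omega = (N.cdf(g0 + g1) - N.cdf(g0)) * p_z / prob_treated(g0, g1, p_z)
    always_taker_effect = -rho * inverse_mills(g0)           # E[Y1 - Y0 | g0 > eta]
    return late(g0, g1, rho) * omega + always_taker_effect * (1 - omega)

# Illustrative inputs (assumed): baseline rate .40, first stage .07.
g0 = N.inv_cdf(0.40)
g1 = N.inv_cdf(0.47) - g0
rho, p_z = -0.1, 0.5

late_value = late(g0, g1, rho)
tt5 = treated_eq5(g0, g1, rho, p_z)
tt6 = treated_eq6(g0, g1, rho, p_z)
print(f"LATE = {late_value:.4f}, effect on treated = {tt5:.4f} (eq. 5), {tt6:.4f} (eq. 6)")
```

With a negative correlation, both quantities are positive in this sketch and the effect on the treated exceeds LATE, because the group with $\gamma_0 > \eta_i$ receives most of the weight in (6).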

2.1.1  LATE  vs.  The  Effect  on  the  Treated 

LATE and the effect on the treated both depend on the correlation between potential outcomes and
the latent first-stage error, and on the first-stage coefficients. The effect on the treated also depends on the
distribution of the instrument. The relationship between alternative causal parameters in the parametric
model is sketched in Fig. 1, which plots ATE (a constant equal to zero), LATE, and the effect on the treated
against $\Phi(\gamma_0)$ for a fixed first stage of .07 and an instrument that is Bernoulli(.5). In other words, as
with the same sex instrument in Angrist and Evans (1998), the simulated instrument is a dummy that equals one
with probability 1/2, and increases the probability that $D_i$ equals 1 by 7 percentage points. The top panel
of Fig. 1 sets $\rho_{10} = -.1$, so that the probability of treatment increases with the gains from treatment, as
in a Roy (1951) model, while the bottom panel sets $\rho_{10} = -.5$ for stronger selection on gains. With positive
$\rho_{10}$, the figure would be reflected through the horizontal axis.

The leftmost point in the figure shows that LATE equals the effect on the treated when
$\Phi(\gamma_0) = E[D_i \mid Z_i = 0] = 0$. This is incompatible with the Normal latent-index model since it requires
$\gamma_0 = -\infty$, but $E[D_i \mid Z_i = 0] = 0$ is an important special case in practice, most commonly in
randomized trials with partial compliance in the treated group only (see, e.g., Bloom, 1984 or Angrist and
Imbens, 1991).^4 At the other end of the figure, the effect on the treated approaches the overall average effect
when almost everyone gets treated. Finally, Fig. 2 shows that increasing the size of the first stage effect
from .07 to .30 pulls both LATE and the effect on the treated closer to the overall average effect.

^4 A leading example is the randomized trial used to evaluate subsidized training programs offered through
the Job Training Partnership Act, one of America's largest Federally-sponsored training programs. Subsidized
training was offered but not compulsory in the randomly selected treatment group. About 60% of those offered
treatment took up the offer, so $E[D_i \mid Z_i = 1] = .6$, where $Z_i$ is the randomized offer of treatment and
$D_i$ is actual training status. On the other hand, (virtually) no one in the control group received treatment,
so $E[D_i \mid Z_i = 0] = 0$. In this case, LATE is the effect on the treated because the set of always-takers is
virtually empty. See Orr, et al. (1996) for an IV analysis of the JTPA.

The effect of treatment on the treated is above LATE for all first-stage baseline values, a
consequence of the fact that selection on gains makes $E[Y_{1i} - Y_{0i} \mid \gamma_0 > \eta_i]$ bigger than LATE.
Moreover, LATE provides a better measure of the effect of treatment on a randomly chosen individual (ATE) than
does the effect on the treated for most parameter values. A final important feature of the figure (also
apparent from equation (4)) is that LATE=ATE when $\gamma_1 = -2\gamma_0$ since $\phi(\gamma_0) = \phi(-\gamma_0)$ by
symmetry of the Normal density. Thus, as noted by Heckman and Vytlacil (2000), a "symmetric first stage" that
changes the probability of treatment from p to 1-p implies LATE equals ATE in the Normal model, or in any
latent variable model with jointly symmetric errors.^5
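The symmetric-first-stage result does not rely on Normality, and a small Monte Carlo sketch can illustrate this. Everything in the example is an assumption made for illustration: the first-stage error is uniform on (-1, 1), the expected gain given the error is the odd function $-\eta^3$ (so ATE is zero), and the thresholds are chosen so that the instrument moves the treatment probability from .3 to .7 in the symmetric case and from .6 to 1 in the asymmetric case.

```python
import random

random.seed(2)

def complier_effect(lower, upper, n=200_000):
    """Average gain among compliers, i.e. draws with lower < eta < upper."""
    gains = []
    for _ in range(n):
        eta = random.uniform(-1.0, 1.0)              # symmetric, non-Normal error
        gain = -eta ** 3 + random.gauss(0.0, 0.5)    # E[gain | eta] is odd, ATE = 0
        if lower < eta < upper:
            gains.append(gain)
    return sum(gains) / len(gains)

# Symmetric first stage: P(D = 1) moves from p = .3 to 1 - p = .7.
symmetric = complier_effect(-0.4, 0.4)
# Asymmetric first stage: P(D = 1) moves from .6 to 1.
asymmetric = complier_effect(0.2, 1.0)

print(f"LATE, symmetric first stage : {symmetric:.3f}")
print(f"LATE, asymmetric first stage: {asymmetric:.3f}")
```

In the symmetric case the complier average is close to the population average of zero, while the asymmetric first stage selects compliers with systematically negative gains.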

3.  Identification  Problems  and  Prospects 

Angrist, Imbens, and Rubin (1996) show that the potential-outcomes framework for IV divides a
population into three groups, which I refer to below as "potential-assignment subpopulations." The first are
compliers, i.e., those for whom $D_{1i} = 1$ and $D_{0i} = 0$. In the latent index model, compliers have
$\gamma_0 + \gamma_1 > \eta_i > \gamma_0$. The other two groups include individuals whose treatment status is
unaffected by the instrument. One consists of never-takers, with $D_{0i} = D_{1i} = 0$. Never-takers are never
treated regardless of the value of $Z_i$ to which they might be exposed. In the latent index model,
never-takers have $\eta_i > \gamma_0 + \gamma_1$. The second unaffected group consists of always-takers, with
$D_{0i} = D_{1i} = 1$. Always-takers are always treated regardless of the value of $Z_i$ to which they might be
exposed. In the latent-index model, always-takers have $\gamma_0 > \eta_i$. A possible fourth group with
$D_{0i} = 1$ and $D_{1i} = 0$ is empty by virtue of the monotonicity assumption.

^5 Joint symmetry means that if $f(y_{ji}, \eta_i)$ is the joint density of $y_{ji} = Y_{ji} - E[Y_{ji}]$ and
$\eta_i$, then $f(-y_{ji}, -\eta_i) = f(y_{ji}, \eta_i)$. A weaker condition with the same result (a symmetric first
stage ranging from p to 1-p gives LATE=ATE) is that $E[y_{ji} \mid \eta_i]$ is an odd function (as for a linear
model) and that $\eta_i$ has a symmetric distribution. Angrist (1991) somewhat more loosely noted that IV
estimates should be close to ATE when the first stage changes the probability of treatment at values centered
on one-half, as is required for the first stage to be symmetric.

The set of the treated is the union of the disjoint sets of always-takers and compliers with $Z_i = 1$. This
provides an interpretation for the following identity:

$$D_i = D_{0i} + (D_{1i} - D_{0i})Z_i,$$

since $D_{0i} = 1$ indicates always-takers and $(D_{1i} - D_{0i})Z_i$ indicates compliers with $Z_i = 1$. Since $Z_i$
is independent of complier status, compliers with $Z_i = 1$ are representative of all compliers. Causal effects
on the treated can therefore be decomposed as:

$$E[Y_{1i} - Y_{0i} \mid D_i = 1] = E[Y_{1i} - Y_{0i} \mid D_{1i} > D_{0i}]\{1 - P(D_{0i} = D_{1i} = 1 \mid D_i = 1)\} \tag{7}$$

$$+ E[Y_{1i} - Y_{0i} \mid D_{0i} = D_{1i} = 1]\,P(D_{0i} = D_{1i} = 1 \mid D_i = 1).$$

Equation (7) generalizes (6), which gives the same decomposition for the Normal model. Because an
instrumental variable provides no information about average treatment effects in the set of always-takers,
LATE is identified while $E[Y_{1i} - Y_{0i} \mid D_i = 1]$ is not.

To further pinpoint the identification challenge in this context, note that
$E[Y_{1i} \mid D_{0i} = D_{1i} = 1]$ and $E[Y_{0i} \mid D_{0i} = D_{1i} = 0]$ can be estimated using the following
relations:

$$E[Y_{1i} \mid D_{0i} = D_{1i} = 1] = E[Y_{1i} \mid D_{0i} = 1] = E[Y_i \mid D_i = 1, Z_i = 0] \tag{8a}$$

$$E[Y_{0i} \mid D_{0i} = D_{1i} = 0] = E[Y_{0i} \mid D_{1i} = 0] = E[Y_i \mid D_i = 0, Z_i = 1]. \tag{8b}$$

The missing pieces of the identification puzzle are therefore the fully counterfactual averages,
$E[Y_{1i} \mid D_{0i} = D_{1i} = 0]$ and $E[Y_{0i} \mid D_{0i} = D_{1i} = 1]$.
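Relations (8a) and (8b) say that two of the four always-taker and never-taker means can be read directly off observed cells. A minimal simulation sketch, assuming a latent-index first stage with illustrative coefficients and an assumed outcome model (none of the numbers are from the paper), compares the observed cell means with benchmarks that use the latent types:

```python
import random

random.seed(3)

GAMMA0, GAMMA1 = -0.5, 1.0   # assumed first-stage coefficients

def mean(values):
    values = list(values)
    return sum(values) / len(values)

rows = []
for _ in range(200_000):
    z = random.randint(0, 1)
    eta = random.gauss(0.0, 1.0)
    d0 = 1 if GAMMA0 > eta else 0              # always-takers have d0 == 1
    d1 = 1 if GAMMA0 + GAMMA1 > eta else 0     # never-takers have d1 == 0
    y0 = 0.5 * eta + random.gauss(0.0, 1.0)    # assumed outcome model
    y1 = y0 + 1.0 - 0.8 * eta
    d = d1 if z == 1 else d0
    rows.append({"y": y1 if d == 1 else y0, "d": d, "z": z,
                 "d0": d0, "d1": d1, "y0": y0, "y1": y1})

# Right-hand sides of (8a) and (8b): observable cell means.
y1_always_obs = mean(r["y"] for r in rows if r["d"] == 1 and r["z"] == 0)
y0_never_obs = mean(r["y"] for r in rows if r["d"] == 0 and r["z"] == 1)

# Infeasible benchmarks that use the latent potential assignments.
y1_always_true = mean(r["y1"] for r in rows if r["d0"] == 1)
y0_never_true = mean(r["y0"] for r in rows if r["d1"] == 0)

print(f"E[Y1 | always-takers]: observed cell {y1_always_obs:.3f}, latent {y1_always_true:.3f}")
print(f"E[Y0 | never-takers] : observed cell {y0_never_obs:.3f}, latent {y0_never_true:.3f}")
```

The observed cells match the latent benchmarks up to sampling error. No analogous cell exists for the outcome of always-takers without treatment or of never-takers with treatment, which is why those two averages remain the missing pieces.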

3.1 Restricting Potential-Assignment Subpopulations

The  conditional  expectation  functions  (CEFs)  of  Y,,  and  Yqj  given  potential  assignments  provide  a 

framework  for  the  discussion  of  alternative  identification  strategies.  These  CEFs  can  be  written: 

E[Y,jDoi,D,i]  =  a,  +  p,oDo,  +  PnD„  (9a) 

E[Yo,|Do„D„]  =  ao+PooDo,  +  Po,D,,  (9b) 

Equations (9a) and (9b) impose no restrictions since there are three potential-assignment subpopulations and
three parameters in each CEF. The 6 conditional means, $E[Y_{ji} \mid D_{0i}, D_{1i}]$, are uniquely determined by (9a,b)
as follows:

Group           Definition                 Indicator           CEF for $Y_{0i}$                    CEF for $Y_{1i}$
Compliers       $D_{1i}=1,\ D_{0i}=0$      $D_{1i}-D_{0i}$     $\alpha_0+\beta_{01}$               $\alpha_1+\beta_{11}$
Always-takers   $D_{1i}=D_{0i}=1$          $D_{0i}$            $\alpha_0+\beta_{00}+\beta_{01}$    $\alpha_1+\beta_{10}+\beta_{11}$
Never-takers    $D_{1i}=D_{0i}=0$          $1-D_{1i}$          $\alpha_0$                          $\alpha_1$

The CEF for observed outcomes, $E[Y_i \mid D_i, Z_i]$, has a distribution with 4 points of support, while the
CEFs of $Y_{0i}$ and $Y_{1i}$ given $D_{0i}$ and $D_{1i}$ depend on 6 parameters. This suggests the latter are not identified from
the former without additional restrictions, a result implied by the theorem below.

THEOREM: Suppose the Independence, First-Stage, and Monotonicity assumptions hold and that $Y_{0i}$ and $Y_{1i}$
have multinomial distributions. Let $f_0(y \mid D_{1i}, D_{0i})$ and $f_1(y \mid D_{1i}, D_{0i})$ denote the conditional distribution
functions for potential outcomes given potential assignments and let $f_{YDZ}(y, d, z)$ denote the joint distribution
of $Y_i$, $D_i$, and $Z_i$. Then $f_0(y \mid D_{1i}, D_{0i})$ and $f_1(y \mid D_{1i}, D_{0i})$ are not identified from $f_{YDZ}(y, d, z)$.

Proof: Factor the d.f. using $f_{YDZ}(y, d, z) = f_{Y|DZ}(y \mid d, z)\, g_{DZ}(d, z)$. The second term is unrestricted. Let

$$f_j(y \mid D_{1i}, D_{0i}) = \alpha_j(y) + \beta_{j0}(y) D_{0i} + \beta_{j1}(y) D_{1i},$$
substitute into $f_{Y|DZ}(y \mid d, z)$, and iterate expectations to obtain the multinomial likelihood solely as a function
of the parameters determining $f_0(y \mid D_{1i}, D_{0i})$ and $f_1(y \mid D_{1i}, D_{0i})$. Finally, substitute for $f_0(y \mid D_{1i}, D_{0i})$ and
$f_1(y \mid D_{1i}, D_{0i})$ to show the likelihood is invariant to the choice of $\beta_{00}(y)$ and $\beta_{11}(y)$ as long as $\alpha_1(y)+\beta_{11}(y)$ is
constant. Non-identification of $\beta_{00}(y)$ implies non-identification of the marginal distribution of $Y_{0i}$ while
non-identification of $\beta_{11}(y)$ implies non-identification of the marginal distribution of $Y_{1i}$.
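The counting argument behind the theorem can be illustrated for means rather than full distributions. The sketch below is an illustration under hypothetical group shares and outcome means, not the proof itself: two sets of potential-outcome means that differ only in the fully counter-factual averages generate identical values of the four observable cell means while implying different values of ATE.

```python
def observed_cell_means(m, p_a, p_c, p_n):
    """Map group-specific potential-outcome means to the four means E[Y | D, Z]."""
    return {
        ("D=1", "Z=0"): m["y1_always"],
        ("D=1", "Z=1"): (p_a * m["y1_always"] + p_c * m["y1_complier"]) / (p_a + p_c),
        ("D=0", "Z=0"): (p_n * m["y0_never"] + p_c * m["y0_complier"]) / (p_n + p_c),
        ("D=0", "Z=1"): m["y0_never"],
    }

def ate(m, p_a, p_c, p_n):
    """Population average treatment effect implied by the six group means."""
    mean_y1 = p_a * m["y1_always"] + p_c * m["y1_complier"] + p_n * m["y1_never"]
    mean_y0 = p_a * m["y0_always"] + p_c * m["y0_complier"] + p_n * m["y0_never"]
    return mean_y1 - mean_y0

shares = dict(p_a=0.3, p_c=0.1, p_n=0.6)     # hypothetical group shares
base = dict(y1_always=3.0, y1_complier=2.0, y1_never=1.0,
            y0_always=1.5, y0_complier=1.0, y0_never=0.0)
# Change only the two fully counter-factual means.
alt = dict(base, y1_never=5.0, y0_always=-2.0)

same_data = observed_cell_means(base, **shares) == observed_cell_means(alt, **shares)
print(same_data, ate(base, **shares), ate(alt, **shares))
```

Because the never-taker mean of $Y_{1i}$ and the always-taker mean of $Y_{0i}$ never enter the observable cells, the data cannot distinguish the two parameterizations.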

The multinomial distributional assumption raises the question of how general the theorem is. It seems
general enough for practical purposes since, as noted by Chamberlain (1987), any distribution can be
approximated arbitrarily well by a multinomial. Moreover, I'd like to rule out identification based on
continuity or support conditions to avoid paradoxes such as "identification at infinity".[7]

3.2  A  Menu  of  Restrictions 

A variety of restrictions on (9a,b) are sufficient to identify ATE. I briefly discuss four cases that
strike me as being of special interest. The simplest is ignorable treatment assignment or "no selection bias."


[7] See Chamberlain (1986). The multinomial assumption has some content since it implies that potential
outcomes have bounded support, so that ATE and effects on the treated are bounded. See Manski (1990) or
Heckman and Vytlacil (2000).

Restriction 1 (NO SELECTION BIAS). $\beta_{00}=\beta_{01}=\beta_{10}=\beta_{11}=0$.

This implies LATE = $\alpha_1-\alpha_0$ = ATE. Under Restriction 1, ATE can be estimated from simple treatment-
control comparisons.

Because the assumption of no selection bias involves four restrictions while two would be sufficient,
ATE is over-identified in this case.[8] A standard Hausman (1978) test for endogeneity exploits over-
identification by comparing IV and OLS estimates, equivalent here to a comparison of Wald estimates with
treatment-control differences. A modified and potentially more powerful test can be based on the fact that
under Restriction 1, $E[Y_{1i} \mid D_{0i}=1]=\alpha_1$ and $E[Y_{0i} \mid D_{1i}=0]=\alpha_0$. Using (8a,b), this suggests the following
specification test:

Test for Selection Bias.

$$\frac{E[Y_i \mid Z_i=1] - E[Y_i \mid Z_i=0]}{E[D_i \mid Z_i=1] - E[D_i \mid Z_i=0]} = E[Y_i \mid D_i=1, Z_i=0] - E[Y_i \mid D_i=0, Z_i=1]. \tag{T1}$$

In the appendix, I show how a test statistic based on T1 can be computed using regression software.
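The two sides of T1 can be computed directly from sample means. The sketch below does so for two stylized populations with hypothetical values, one satisfying Restriction 1 and one with selection bias. It reports only the point contrast; the standard error needed for a formal test is described in the appendix and is not reproduced here. The helper names `build_sample` and `t1_contrast` are illustrative.

```python
import numpy as np

def build_sample(groups):
    """Stack (y, d, z) rows for groups given as (D_0i, D_1i, Y_0i, Y_1i, units per arm)."""
    rows = []
    for d0, d1, y0, y1, count in groups:
        for z in (0, 1):
            d = d0 + (d1 - d0) * z
            rows += [(y1 if d == 1 else y0, d, z)] * count
    y, d, z = np.array(rows, dtype=float).T
    return y, d, z

def t1_contrast(y, d, z):
    """Return both sides of (T1): the Wald estimate and the always/never-taker contrast."""
    wald = (y[z == 1].mean() - y[z == 0].mean()) / (d[z == 1].mean() - d[z == 0].mean())
    contrast = y[(d == 1) & (z == 0)].mean() - y[(d == 0) & (z == 1)].mean()
    return wald, contrast

# Rows are always-takers, compliers, never-takers (all numbers hypothetical).
no_selection = [(1, 1, 1.0, 2.0, 300), (0, 1, 1.0, 2.0, 100), (0, 0, 1.0, 2.0, 600)]
selection = [(1, 1, 1.5, 3.0, 300), (0, 1, 1.0, 2.0, 100), (0, 0, 0.0, 1.0, 600)]

wald_ns, contrast_ns = t1_contrast(*build_sample(no_selection))
wald_s, contrast_s = t1_contrast(*build_sample(selection))
print(wald_ns - contrast_ns)   # approximately zero when Restriction 1 holds
print(wald_s - contrast_s)     # far from zero under selection bias
```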

The Hausman test for selection bias replaces $E[Y_i \mid D_i=1, Z_i=0]-E[Y_i \mid D_i=0, Z_i=1]$ on the right hand
side of T1 with $E[Y_i \mid D_i=1]-E[Y_i \mid D_i=0]$. The Hausman test will also work in the causal framework outlined
here since under Restriction 1 both OLS and IV estimate ATE. The difference between T1 and a Hausman
test arises from the fact that the Hausman test implicitly compares $E[Y_{ji} \mid D_{1i}>D_{0i}]$ with $E[Y_{ji} \mid D_i=j]$ for $j=0,1$,
while T1 implicitly compares $E[Y_{ji} \mid D_{1i}>D_{0i}]$ with $E[Y_{ji} \mid D_{1i}=D_{0i}=j]$ for $j=0,1$. These two pairs of comparisons
are the same under monotonicity but not in general. The empirical results below suggest that T1, which uses
monotonicity, indeed provides a more powerful specification test.[9]

[8] A weaker version of Restriction 1 with $\beta_{11}=\beta_{00}=0$ is also sufficient to identify ATE since this equates never-
takers with compliers for the CEF of $Y_{1i}$ and always-takers with compliers for the CEF of $Y_{0i}$. This seems no easier to
motivate than Restriction 1, so I limit the discussion to the over-identified case.

While pivotal for specification testing, the assumption of no selection bias is an unattractive basis
for causal inference since the use of IV is motivated by the possibility of selection bias. An alternative
assumption that allows for selection bias amounts to the claim that the difference between $Y_{1i}$ and $Y_{0i}$ is
mean-independent of potential treatment assignments. I refer to this as "conditional constant effects."
Formally, this means:

Restriction 2 (conditional constant effects). $\beta_{00}=\beta_{10}$; $\beta_{01}=\beta_{11}$.

This pair of restrictions is just sufficient to identify ATE. In particular, we again have LATE = $\alpha_1-\alpha_0$ =
ATE, or, equivalently, $E[Y_{1i}-Y_{0i} \mid D_{1i}, D_{0i}] = E[Y_{1i}-Y_{0i}]$. While Restriction 2 allows for selection bias in the
sense that $Y_{1i}$ and $Y_{0i}$ are correlated with potential treatment assignments, the correlation is restricted to be
the same for both potential outcomes, so that the difference between $Y_{1i}$ and $Y_{0i}$ is orthogonal to potential
treatment assignments.

In the same sex example, Restriction 2 amounts to saying that average treatment effects, while not
constant, are nevertheless the same regardless of a woman's likelihood of having children. Restriction 2
rules out Roy (1951) type selection, where treatment status is determined at least in part by the gains from
treatment. In the case of childbearing, for example, a woman's childbearing decision must (somewhat
implausibly) be independent of individual-level variation in the labor-supply consequences of childbearing.
On the plus side, Restriction 2 is weaker than the usual constant-effects assumption in that it does not require
a deterministic link between $Y_{1i}$ and $Y_{0i}$.[10]


[9] Abadie (2002) develops a number of related bootstrap specification tests.

[10] Note that the first part of Restriction 2 is sufficient to identify the effect of treatment on the treated, while
the second part is sufficient to identify the effect of treatment on the non-treated. Although conditional constant
effects is the basis of much empirical work and may be a reasonable approximation for practical purposes, as a
theoretical matter this is typically implausible unless treatment is exogenous; see, e.g., Wooldridge (1997, 2003).

A third restriction, which I call "linearity," is appealing because it is not fundamentally inconsistent
with a benchmark Roy-type selection model. The linearity condition is:

Restriction 3 (linearity). $\beta_{00}=\beta_{01}$; $\beta_{10}=\beta_{11}$.

In this case, the potential-outcomes CEFs can be written:

$$E[Y_{1i} \mid D_{0i}, D_{1i}] = \alpha_1 + \beta_{11}(D_{0i} + D_{1i}) \tag{10a}$$

$$E[Y_{0i} \mid D_{0i}, D_{1i}] = \alpha_0 + \beta_{01}(D_{0i} + D_{1i}). \tag{10b}$$

Restriction 3 requires the potential-outcomes CEF to be linear in $D_i^* = D_{0i} + D_{1i}$, where $D_i^*$ is a summary
measure of the desire or suitability of an individual for treatment. If the restriction is false, we can
nevertheless think of (10a) and (10b) as providing a minimum mean-squared error approximation to the
unrestricted model, (9a) and (9b).

To see how average causal effects are identified under Restriction 3, write the probabilities of being
an always-taker and never-taker as

$$P[D_{0i}=D_{1i}=1] = E[D_{0i}] = p_a$$

$$P[D_{0i}=D_{1i}=0] = E[1-D_{1i}] = p_n,$$
and note that

$$E[D_{0i} + D_{1i}] = 1 + (p_a - p_n).$$
Substitute into (10a) and (10b) and difference to obtain

$$E[Y_{1i}-Y_{0i}] = [(\alpha_1 + \beta_{11}) - (\alpha_0 + \beta_{01})] + (\beta_{11} - \beta_{01})(p_a - p_n) \tag{11}$$

$$= E[Y_{1i}-Y_{0i} \mid D_{1i}>D_{0i}] + \{(E[Y_{1i} \mid D_{0i}=1]-E[Y_{1i} \mid D_{1i}>D_{0i}]) - (E[Y_{0i} \mid D_{1i}>D_{0i}]-E[Y_{0i} \mid D_{1i}=0])\}(p_a - p_n).$$
The components on the right hand side of (11) are easily estimated; details are given in the appendix.
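A minimal sketch of the calculation in (11), taking the identified group means and shares as given inputs. The function name `ate_under_linearity` and all numbers are hypothetical; the example CEFs are constructed to satisfy Restriction 3 so that the formula can be checked against a direct average over the three groups.

```python
def ate_under_linearity(y1_always, y1_complier, y0_complier, y0_never, p_a, p_n):
    """ATE implied by equation (11) under Restriction 3."""
    late = y1_complier - y0_complier
    # beta_11 - beta_01, written in terms of identified group means
    slope_gap = (y1_always - y1_complier) - (y0_complier - y0_never)
    return late + slope_gap * (p_a - p_n)

# Hypothetical linear CEFs: alpha_1 = 1, beta_11 = 1, alpha_0 = 0, beta_01 = 0.5.
p_a, p_c, p_n = 0.3, 0.1, 0.6
y1 = {"never": 1.0, "complier": 2.0, "always": 3.0}
y0 = {"never": 0.0, "complier": 0.5, "always": 1.0}

implied = ate_under_linearity(y1["always"], y1["complier"],
                              y0["complier"], y0["never"], p_a, p_n)
# Direct average of group-specific effects, feasible only because the
# example specifies the counter-factual means as well.
direct = sum(p * (y1[g] - y0[g])
             for g, p in [("always", p_a), ("complier", p_c), ("never", p_n)])
print(implied, direct)
```

In this example LATE is 1.5 while ATE is 1.35, the gap being the slope difference (0.5) times the multiplier $p_a - p_n$ (-0.3).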

A calculation similar to that used to derive (11) shows that the effect of treatment on the treated can
be constructed using

$$E[Y_{1i} - Y_{0i} \mid D_i=1] = [(\alpha_1 + \beta_{11}) - (\alpha_0 + \beta_{01})] + (\beta_{11} - \beta_{01})(p_a/p_d) \tag{12}$$

$$= E[Y_{1i}-Y_{0i} \mid D_{1i}>D_{0i}] + \{(E[Y_{1i} \mid D_{0i}=1]-E[Y_{1i} \mid D_{1i}>D_{0i}]) - (E[Y_{0i} \mid D_{1i}>D_{0i}]-E[Y_{0i} \mid D_{1i}=0])\}(p_a/p_d),$$
where $p_d$ is the probability of treatment. From (12), we can immediately derive the Bloom (1984) result that
if there are no always-takers, LATE is the effect on the treated.[11]
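The same inputs deliver the effect on the treated via (12). The sketch below uses hypothetical numbers, a hypothetical function name, and an instrument with $P(Z_i=1)=.5$, and also checks the Bloom special case in which there are no always-takers.

```python
def tot_under_linearity(y1_always, y1_complier, y0_complier, y0_never, p_a, p_d):
    """Effect of treatment on the treated implied by equation (12) under Restriction 3."""
    late = y1_complier - y0_complier
    slope_gap = (y1_always - y1_complier) - (y0_complier - y0_never)
    return late + slope_gap * (p_a / p_d)

# Hypothetical values satisfying Restriction 3.
p_a, p_c, p_z1 = 0.3, 0.1, 0.5
p_d = p_a + p_c * p_z1        # treated units: always-takers plus compliers with Z_i = 1

tot = tot_under_linearity(3.0, 2.0, 0.5, 0.0, p_a, p_d)
# Direct calculation: weight always-taker and complier effects by their shares of the treated.
tot_direct = (p_a * (3.0 - 1.0) + p_c * p_z1 * (2.0 - 0.5)) / p_d

# Bloom case: with no always-takers the correction term vanishes.
tot_bloom = tot_under_linearity(3.0, 2.0, 0.5, 0.0, p_a=0.0, p_d=p_c * p_z1)
print(tot, tot_direct, tot_bloom)
```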

3. 1  Symmetry  Revisited 

Restriction 3 is closely related to the symmetry property discussed in the parametric example. To
see this, note that as a consequence of linearity we can interpolate the CEF for compliers by averaging as
follows:

$$E[Y_{ji} \mid D_{1i} > D_{0i}] = \{E[Y_{ji} \mid D_{0i}=1] + E[Y_{ji} \mid D_{1i}=0]\}/2. \tag{13}$$

This means that expected outcomes for compliers can be obtained as the average of expected outcomes for
always- and never-takers. What distributional assumptions support a relation like (13)? Suppose treatment
is determined by a latent-index assignment mechanism, as in equation (2). Then,

$$E[Y_{ji} \mid D_{1i} = D_{0i}=0] = E[Y_{ji} \mid \eta_i > \gamma_0 + \gamma_1]$$

$$E[Y_{ji} \mid D_{1i} = D_{0i}=1] = E[Y_{ji} \mid \eta_i < \gamma_0],$$
and

$$E[Y_{ji} \mid D_{1i} > D_{0i}] = E[Y_{ji} \mid \gamma_0 + \gamma_1 > \eta_i > \gamma_0].$$
If in addition, $\gamma_1=-2\gamma_0$, then equation (13) holds as long as $(Y_{ji}, \eta_i)$ is jointly symmetric, as in the Normal
model. The restriction $\gamma_1=-2\gamma_0$ implies

$$P[D_i=1 \mid Z_i=0] = P[\eta_i<\gamma_0] = 1-p \tag{14}$$

$$P[D_i=1 \mid Z_i=1] = P[\eta_i<-\gamma_0] = p$$
for some $p \in [0,1]$ so the first stage is also symmetric (e.g., a first stage effect of .1 that shifts the probability
of treatment from $1-p=.45$ to $p=.55$).

[11] With no always-takers, we have $D_{0i} \equiv 0$, so Restriction 3 is not binding.

The upshot of the previous discussion is that a symmetric latent error distribution and a symmetric
first stage imply the interpolating property, (13), or, equivalently, Restriction 3. Moreover, we again have
LATE equals ATE since $p_a=p_n$ given the first stage described in (14).[12] Intuitively, a symmetric first-stage
with symmetrically distributed latent errors equates LATE with ATE because average treatment effects for
individuals with characteristics that place them in the middle of the $\eta_i$ distribution (compliers) are
representative of average treatment effects for individuals over the entire distribution of $\eta_i$.
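The interpolation property can be checked numerically for a standard Normal latent error. The sketch below assumes, consistent with the inequalities displayed above, that always-takers have $\eta_i<\gamma_0$, compliers have $\gamma_0<\eta_i<\gamma_0+\gamma_1$, and never-takers have $\eta_i>\gamma_0+\gamma_1$; the value of $\gamma_0$ is hypothetical. When the outcome CEFs are linear in $\eta_i$, as under joint Normality, the same averaging relation carries over from group means of $\eta_i$ to group means of $Y_{ji}$, which is (13).

```python
import math

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def mean_eta(lo, hi):
    """E[eta | lo < eta < hi] for a standard Normal eta (truncated-Normal mean)."""
    return (normal_pdf(lo) - normal_pdf(hi)) / (normal_cdf(hi) - normal_cdf(lo))

gamma0 = -0.1257            # hypothetical; P[eta < gamma0] is roughly .45
gamma1 = -2.0 * gamma0      # the symmetric first stage of (14)

eta_always = mean_eta(-math.inf, gamma0)
eta_complier = mean_eta(gamma0, gamma0 + gamma1)
eta_never = mean_eta(gamma0 + gamma1, math.inf)

# Compliers sit at the midpoint of always- and never-takers.
print(eta_complier, 0.5 * (eta_always + eta_never))
# Group shares: p_a and p_n coincide, so the multiplier p_a - p_n is zero.
print(normal_cdf(gamma0), 1.0 - normal_cdf(gamma0 + gamma1))
```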

A  first-stage  relationship  may  be  fortuitously  symmetric,  as  for  the  1990  Census  sample  of  teen 
mothers  using  the  same  sex  instrument.  In  such  cases,  it  seems  reasonable  to  invoke  Restriction  3  and 
proceed  under  the  assumption  that  LATE  equals  ATE.  But  what  if,  as  seems  more  typical,  the  first  stage 
shifts  the  probability  of  treatment  asymmetrically?  In  the  empirical  section,  I  describe  a  simple  scheme  for 
using  covariates  to  construct  a  subsample  with  a  symmetric  first  stage.  IV  should  estimate  average  treatment 
effects  in  this  specially  constructed  sample.  This  approach  naturally  raises  the  question  of  how  to  use 
average  treatment  effects  for  one  sample  to  make  inferences  about  average  effects  in  another.  For  a  recent 
attack  on  this  question,  see  Hotz,  Imbens  and  Klerman  (2000),  who  outline  a  procedure  designed  to 
extrapolate  the  results  from  randomized  trials  across  sites  with  different  populations.  Here  I  rely  on  the  fact 
that  if  effects  differ  little  between  two  samples  with  and  without  a  symmetric  first-stage,  then  given 
Restriction  3,  the  extrapolation  problem  is  solved  under  the  maintained  assumption  that  average  treatment 
effects  would  be  similar  in  the  symmetric  sample  and  its  complement. 


[12] To see this, note that $P[D_i=1 \mid Z_i=0]=1-p$ implies $p_a = 1-p$. Since $p_a + p_n + \{P[D_i=1 \mid Z_i=1]-P[D_i=1 \mid Z_i=0]\}$
$= (1-p) + p_n + (2p-1)=1$, this implies $p_n=1-p$. As noted in the discussion of the parametric model, LATE=ATE
given (14) also results when $E[Y_{ji} \mid \eta_i]$ is an odd function and the marginal distribution of $\eta_i$ is symmetric. This makes
it possible to have a relation like (13) with, say, a binary or otherwise limited dependent variable for which a
symmetric distribution is implausible.
3.2 Weakening Restriction 3

Suppose again that treatment assignment can be modeled using equation (2), and that the potential-
outcomes CEFs are linear in $\eta_i$ (as would be the case under joint Normality). Then we can write,

$$E[Y_{1i} \mid D_{0i}, D_{1i}] = \alpha_1 + \beta_1 E[\eta_i \mid D_{0i}, D_{1i}] \tag{15a}$$

$$E[Y_{0i} \mid D_{0i}, D_{1i}] = \alpha_0 + \beta_0 E[\eta_i \mid D_{0i}, D_{1i}] \tag{15b}$$

where

$$E[\eta_i \mid D_{0i}, D_{1i}] = E[\eta_i \mid D_{1i}=0] + \{E[\eta_i \mid D_{0i}=1]-E[\eta_i \mid D_{1i}=1, D_{0i}=0]\}D_{0i} + \{E[\eta_i \mid D_{1i}=1, D_{0i}=0]-E[\eta_i \mid D_{1i}=0]\}D_{1i}. \tag{16}$$
Substituting (16) into (15a) and (15b) generates an expression for the coefficients in (9a), (9b). This leads
to the following generalization of Restriction 3:

Restriction 4 (proportionality). $\beta_{00}=\theta\beta_{01}$; $\beta_{10}=\theta\beta_{11}$, for $\theta>0$.

The first part of the proportionality restriction comes from (15a,b) alone. Using (16), we have

$$\theta = \{E[\eta_i \mid D_{0i}=1]-E[\eta_i \mid D_{1i}=1, D_{0i}=0]\}/\{E[\eta_i \mid D_{1i}=1, D_{0i}=0]-E[\eta_i \mid D_{1i}=0]\}, \tag{17}$$

which shows why $\theta$ is positive.

Restriction 4 leads to a generalization of the interpolation formula for average potential outcomes.
In particular, we now have

$$E[Y_{ji} \mid D_{1i} > D_{0i}] = (1/(1+\theta))E[Y_{ji} \mid D_{0i}=1] + (\theta/(1+\theta))E[Y_{ji} \mid D_{1i}=0], \tag{18}$$

so that if $\theta=0$, compliers have the same expected potential outcomes as always-takers, while as $\theta$ approaches
infinity, compliers have the same expected potential outcomes as never-takers.

The linearity assumption used to motivate Restriction 4 seems most plausible in the context of a
model for continuous outcomes. It may be more of a stretch, however, for binary outcomes such as marital
status. On the other hand, without covariates the distribution of $\eta_i$ is arbitrary. We can therefore define $\eta_i$
as the latent error term in an assignment mechanism like (2), after transformation to a uniform distribution
on the unit interval.[13] This guarantees that (15a,b) can generate fitted values for outcome CEFs that also fall
in the unit interval. Alternately, the weighted average in (18) can be motivated directly as a natural
generalization of equally-weighted interpolation using (13).

To develop an estimator using (18), substitute Restriction 4 into (9a) and (9b) to obtain:

$$E[Y_{1i} \mid D_{0i}, D_{1i}] = \alpha_1 + \beta_{11}(\theta D_{0i} + D_{1i})$$

$$E[Y_{0i} \mid D_{0i}, D_{1i}] = \alpha_0 + \beta_{01}(\theta D_{0i} + D_{1i}).$$
Differencing and averaging, we have

$$E[Y_{1i}-Y_{0i}] = [(\alpha_1 + \beta_{11})-(\alpha_0 + \beta_{01})]+(\beta_{11}-\beta_{01})(\theta p_a-p_n) \tag{19}$$

$$= E[Y_{1i} - Y_{0i} \mid D_{1i} > D_{0i}] + \{\theta^{-1}(E[Y_{1i} \mid D_{0i}=1]-E[Y_{1i} \mid D_{1i}>D_{0i}]) - (E[Y_{0i} \mid D_{1i}>D_{0i}]-E[Y_{0i} \mid D_{1i}=0])\}(\theta p_a - p_n).$$
We can map out the values of ATE consistent with the data by evaluating (19) for alternative choices of $\theta$.
This sensitivity analysis is subject to the caveat that at the extremes where $\theta$ equals zero or infinity, ATE is
not identified, a fact apparent from (18).[14]
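The sensitivity analysis can be sketched as a loop over candidate values of $\theta$, holding the identified moments fixed. The function name `ate_under_proportionality`, the moments, and the grid are all hypothetical; the estimation details for the moments themselves are in the appendix and are not reproduced here.

```python
def ate_under_proportionality(y1_always, y1_complier, y0_complier, y0_never,
                              p_a, p_n, theta):
    """ATE implied by equation (19) under Restriction 4 for a given theta > 0."""
    late = y1_complier - y0_complier
    beta_11 = (y1_always - y1_complier) / theta   # always-taker gap scaled by 1/theta
    beta_01 = y0_complier - y0_never
    return late + (beta_11 - beta_01) * (theta * p_a - p_n)

# Hypothetical identified moments and group shares.
moments = dict(y1_always=2.5, y1_complier=2.0, y0_complier=0.5, y0_never=0.0,
               p_a=0.3, p_n=0.6)

for theta in (0.25, 0.5, 1.0, 2.0, 4.0):
    print(theta, round(ate_under_proportionality(theta=theta, **moments), 3))
```

Setting `theta=1.0` reproduces the Restriction 3 formula (11), so the grid nests the linearity case.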

An alternative to sensitivity analysis is to try to estimate $\theta$ using (17). Although $\theta$ is not identified
without further assumptions, it clearly depends in large part on the first stage coefficients, $\gamma_0$ and $\gamma_1$. This
suggests a strategy for estimating $\theta$ using information on these coefficients only. Suppose that (15a,b) holds
for a latent error transformed to Uniform as discussed above, or that the CDF of $\eta_i$ can be approximated by
a uniform distribution on the unit interval. Then a straightforward calculation gives

$$\theta = [\gamma_0 + \gamma_1]/[1 - \gamma_0] = P(D_i=1 \mid Z_i=1)/[1 - P(D_i=1 \mid Z_i=0)]. \tag{20}$$

This has the property that $\theta=1$ when $P(D_i=1 \mid Z_i=1)=1-P(D_i=1 \mid Z_i=0)$, while capturing deviations from
symmetry in a straightforward manner. The value of $\theta$ calculated using (20) in the 1990 Census sample
analyzed here is .61, close to the value calculated using Normality (.58).

[13] This requires that the underlying error have a continuous distribution.

[14] To compute the effect of treatment on the treated under Restriction 4, replace $\theta p_a-p_n$ with $\theta p_a/p_d$ in (19).

Specification  Tests  for  Homogeneity  Restrictions 

Because ATE is, by definition, invariant to the particular instrument used to estimate it,
Restrictions 2, 3, and 4 can be partly checked by comparing alternative estimates using different instruments.
In the case of Restriction 2, this amounts to a Sargan (1958) over-identification test comparing alternative
IV estimates of the same structural coefficient. Under Restrictions 3 and 4, the relevant comparison should
use equation (19) to convert estimates of LATE into estimates of ATE. A final set of specification tests is
suggested by the fact that under Restrictions 3 or 4,

$$E[Y_{1i} - Y_{0i} \mid D_{1i}, D_{0i}] = (\alpha_1 - \alpha_0) + (\beta_{11} - \beta_{01})(\theta D_{0i} + D_{1i}).$$
A test of whether $\beta_{11} - \beta_{01}$ equals zero is therefore a test of conditional constant effects, while a test of
whether $\beta_{11} - \beta_{01}$ is positive is a test for Roy-type selection on the gains from treatment.

4.  Childbearing,  Marital  Status,  and  Economic  Welfare 

The same sex instrument is a dummy for having two boys or two girls at first and second birth.
Angrist and Evans (1998) showed this instrument increases the likelihood that mothers with at least two children
go on to have a third child by about 6-7 percentage points, but is otherwise uncorrelated with mothers'
demographic characteristics. The data set used here is the 1990 Census extract used in the Angrist and Evans
paper. This sample includes mothers aged 21-35 with two or more children, the oldest of whom was less
than 18 at the time of the Census.

Descriptive  statistics  are  reported  in  Table  1  for  the  full  sample,  for  a  subsample  of  ever-married 
women,  and  for  four  subsamples  defined  by  mothers'  education  and  age  at  first  birth.  The  division  into 
subsamples  was  motivated  by  earlier  results  showing  markedly  different  effects  of  childbearing  by  maternal 
education, and because of the policy interest in teen mothers. The probability of having a third child ranges
from a low of .33 in the sample of women with some college, to a high of .5 in the sample of teen mothers.
The probability of having a same-sex sibling pair is more or less constant at .505. Some of the estimates
control for the demographic covariates listed in Table 1 using linear models.[15] Means for the outcome
variables of interest appear at the bottom of Table 1.

4.1  OLS,  IV,  and  2SLS  Estimates 

The effect of same sex on the probability of having a third child varies from a low of 5.9 percentage
points in the some-college sample to a high of 6.5 percentage points in the no-college sample. This can be
seen in the first row of Table 2, which reports first-stage estimates. The first-stage effect without covariates,
$E[D_i \mid Z_i=1]-E[D_i \mid Z_i=0] = E[D_{1i}-D_{0i}]$, is also an estimate of the proportion of the population in the compliers
group.[16] As a benchmark, the next two rows of Table 2 show estimates of the effect of childbearing on two
of the labor supply variables studied by Angrist and Evans (1998). These are IV and OLS estimates from
models without covariates, i.e., Wald estimates and simple treatment-control contrasts.

The Wald (IV) estimates of the effect of a third child on employment status and weeks worked
suggest mothers reduced their labor supply as a consequence of childbearing, though not by as much as
indicated by the OLS estimates. For example, women who had a third child were about 13 percentage points
less likely to work, but the corresponding IV estimate suggests a causal effect of only 8 percentage points.
The OLS and IV estimates for weeks worked are about -7 and -5. The IV estimates of labor supply effects
are larger for less-educated women than for those with some college; in fact, the labor supply estimates are
not significant in the some-college sample. In contrast, the IV estimates are smaller for women who had their
first birth as a teenager than for women who had their first birth as an adult.

[15] See Abadie (2003) and Frölich (2002) for nonlinear causal models with covariates.

[16] Although we cannot identify individual compliers in any sample and tabulate their characteristics directly,
it nevertheless is possible to use data to describe the distribution of complier characteristics, and to compare this to
the unconditional distribution. In particular, the difference in first-stage estimates across samples defined by
covariates characterizes the distribution of covariates among compliers. To see this, note that for a binary covariate,
$x_i$, $E[x_i \mid D_{1i}>D_{0i}]/E[x_i] = E[D_{1i}-D_{0i} \mid x_i=1]/E[D_{1i}-D_{0i}]$. Table 2 therefore also shows same sex compliers to be less
educated and more likely to have been married than the overall average.

The last two rows in Table 2 show first-stage, OLS and two-stage least squares (2SLS) estimates
after adding controls for age, age at first birth, dummies to indicate first-born and second-born boys, race
dummies, and dummies for three schooling groups. Since same sex is uncorrelated with these covariates,
including them has little effect on the 2SLS estimates. Moreover, in spite of the fact that some of the
covariates are good predictors of outcomes, estimates with covariates are only slightly more precise
than those without. Perhaps more surprisingly, the OLS estimates of labor supply effects also change little
in response to the addition of covariates.

Estimates of the effect of having a third child on marital status, poverty status, and welfare use are
reported in Table 3 for models with and without covariates. In the sample of all women, those with more
children are less likely to be married. But this is at least in part due to uncontrolled demographic factors such
as age at first birth, since OLS estimates with controls show that additional childbearing is associated with an
increase in the likelihood of being married. In contrast to the OLS estimates, IV estimates with or without
covariates suggest that the causal effect of childbearing is a reduced probability of being married. Thus, an
important finding is that when the effect of childbearing is estimated in models with demographic controls,
IV and OLS estimates have opposite signs.

The  most  important  change  in  marital  status  caused  by  childbearing  appears  to  be  an  increase  in  the 
likelihood  of  being  divorced  or  separated.  The  estimated  effects  of  childbearing  on  the  probability  of  being 
ever-married  or  divorced  (but  not  separated)  are  not  significantly  different  from  zero.  Consistent  with  an 
increase  in  marital  breakup,  the  birth  of  a  third  child  also  appears  to  lead  to  a  marked  increase  in  the 
likelihood  a  woman  lives  in  a  family  with  total  family  income  below  the  poverty  line.  Here  we  should  expect 
at least a mechanical effect since the poverty threshold falls as family size increases. Although OLS
estimates are larger than IV estimates in models without covariates, OLS and IV estimates in models with
covariates both indicate that a third child increases the likelihood a woman is poor by 9-10 percentage points.

Given the elevated rates of marital breakup and the increase in poverty rates that appear to be caused
by childbearing, it seems reasonable to expect that the birth of a third child also increases the likelihood a
woman is on welfare. Both the IV and OLS estimates tend to support this, though the IV estimates are
imprecise. The OLS estimates of the effect on welfare use range from 6.7 percentage points without
covariates to 3.9 percentage points with covariates. The IV estimate is a marginally significant 3.3 percentage
points with or without covariates. While small in levels, an effect of this magnitude represents a roughly one-third
increase in the number of women on welfare.

The  IV  estimates  show  no  relationship  between  childbearing  and  the  probability  a  woman  has  ever 
been  married,  so  estimates  limited  to  the  sample  of  ever-married  women  are  unlikely  to  be  affected  by 
selection  bias.  Not  surprisingly,  therefore,  the  IV  estimates  in  the  sample  of  ever-married  women  are  almost 
identical  to  those  in  the  full  sample.  On  the  other  hand,  while  the  IV  estimate  of  the  reduction  in  marriage 
rates  is  a  significant  8  percentage  points  (s.e.=.028)  for  women  with  no  college,  it  is  close  to  zero  and 
insignificant  for  women  with  some  college.  The  effects  of  childbearing  on  poverty  are  also  larger  in  the  no- 
college  sample,  though  the  difference  in  effects  on  welfare  use  by  college  status  is  reversed  and  much  smaller 
than  the  difference  in  effects  on  poverty  rates. 

The differences in estimates by mothers' age at first birth also suggest a pattern of larger effects with
decreasing socioeconomic status, though the contrast is not as clear cut as the differences by schooling group.
While the increases in marital dissolution and welfare receipt are larger for teen mothers than for adult
mothers, the estimates are significant only in the latter group. Estimates of effects on divorce/separation are
similar in the two groups, though again much more precise for the sample of adult mothers. This difference
in precision undoubtedly reflects the smaller sample of teen mothers. One clear contrast, however, is the
higher likelihood that a third birth pushes a teen mother into poverty. The impact on poverty status is
significant regardless of mothers' age at first birth, but it is roughly three times larger for teen mothers.
4.2 Heterogeneity across potential-assignment subpopulations

The first-stage estimates imply that 6-7% of each sample consists of compliers, i.e., mothers who
had a child in response to a homogeneous sibling-sex mix. Because the overall probability of treatment ranges
upwards from about .32, the overwhelming majority of treated individuals are always-takers. This can be
seen in Table 4, which gives the distribution of potential-assignment subpopulations. In the sample of all
women, for example, 6.3% are compliers, 34% are always-takers (i.e., have a third child without regard to
sibling-sex composition), and 59% are never-takers (i.e., would never have a third child regardless of
sibling-sex composition). The proportion of treated who are compliers is $1 - (p_a/p_d)$, or about 8%. Given the
relatively small proportion of compliers, the scope for differences in average causal effects across potential-
assignment subpopulations is substantial.

Table 4 also reports the estimate of $(p_a-p_n)$, the multiplier that determines how far LATE is from
ATE when the latter is calculated using Restriction 3 and equation (11), or in models with covariates as
described in the Appendix. The estimate of $(p_a-p_n)$ is -.25 in the full sample, and ranges from 0 for teen
mothers to -.352 in the sample of adult mothers.
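The subpopulation shares follow mechanically from the first stage under Independence and Monotonicity. The sketch below plugs in rounded values quoted in the text for the full sample (34% always-takers, 6.3% compliers, and a same-sex probability of .505); because the inputs are rounded, the results only approximate the Table 4 entries. The function name `assignment_shares` is illustrative.

```python
def assignment_shares(p_d_z0, p_d_z1, p_z1):
    """Potential-assignment shares implied by the first stage."""
    p_a = p_d_z0                  # always-takers are treated even when Z_i = 0
    p_c = p_d_z1 - p_d_z0         # compliers: the first-stage effect
    p_n = 1.0 - p_d_z1            # never-takers are untreated even when Z_i = 1
    p_d = p_a + p_c * p_z1        # overall probability of treatment
    return {
        "always": p_a,
        "complier": p_c,
        "never": p_n,
        "treated": p_d,
        "compliers_among_treated": 1.0 - p_a / p_d,
        "multiplier": p_a - p_n,
    }

# Rounded full-sample inputs taken from the text: P(D=1|Z=0) = .34, first stage = .063.
shares = assignment_shares(p_d_z0=0.34, p_d_z1=0.403, p_z1=0.505)
for name, value in shares.items():
    print(name, round(value, 3))
```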

4.2.1  Symmetric  subpopulations 

The value of zero for $(p_a-p_n)$ in the teen mother sample is noteworthy because it means that LATE
is the same as ATE under Restriction 3. This is a consequence of the fact that the first stage for teen mothers
is almost perfectly symmetric: the same sex instrument shifts the probability of further childbearing from
about .47 to .53. Moreover, because $\theta$ for teen mothers is about 1 when estimated using (20), estimates of
ATE for teen mothers under Restriction 4 are also close to LATE.
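Equation (20) is simple enough to evaluate by hand; the short sketch below does so for the teen-mother first stage quoted above and for rounded full-sample values taken from the text (34% always-takers plus a 6.3 point first stage). The function name `theta_uniform` is illustrative.

```python
def theta_uniform(p_d_z0, p_d_z1):
    """Equation (20): theta = P(D=1 | Z=1) / [1 - P(D=1 | Z=0)]."""
    return p_d_z1 / (1.0 - p_d_z0)

theta_teen = theta_uniform(0.47, 0.53)    # symmetric first stage
theta_all = theta_uniform(0.34, 0.403)    # rounded full-sample first stage
print(round(theta_teen, 3), round(theta_all, 3))
```

The full-sample value is close to the .61 reported for the 1990 Census sample, and the teen-mother value equals one because the first stage is symmetric around one-half.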

The first two columns of Table 5 focus on the comparison between estimates for all women and teen
mothers only, repeating earlier estimates for these samples from Table 3 without covariates, including the
first-stage coefficient and intercept. For the most part, IV estimates for teen mothers are similar to those for
the sample of all women. While the estimated effect on employment is considerably smaller in magnitude at -.026
(s.e.=.051) versus -.084 (s.e.=.027) in the full sample, the effect of childbearing on weeks worked is -5.2
(s.e.=1.3) in the full sample and -4.8 (s.e.=2.4) for teen mothers. Similarly, the effect on marital status is
-.062 (s.e.=.024) in the full sample and -.066 (s.e.=.051) for teen mothers. Note that we can view the parameters
estimated in the full sample as estimates of $E[Y_{1i}-Y_{0i} \mid D_{1i}>D_{0i}, X=\text{all women}]$, while the estimates in column
2, for teen mothers, can be interpreted as measuring $E[Y_{1i}-Y_{0i} \mid X=\text{teen mothers}]$ under Restriction 3 or 4.
A test of equality across columns 1 and 2 is therefore a joint test of the invariance of average treatment
effects to conditioning both on X and on the compliers potential-outcomes subpopulation. The fact that these
are similar is evidence against substantial treatment effect heterogeneity in both dimensions, though of course
there are scenarios where this test has no power.[17]

There is some evidence for a difference in effects on poverty status between the teen mother and all-women
samples. For all women, the IV estimate of the effect of childbearing on poverty status is .095
(s.e.=.023), while the corresponding estimate is .143 (s.e.=.05) in the teen mother sample. The comparison
across samples is weakened, however, by the fact that the estimates in the teen mother sample are much less
precise  than  in  the  full  sample.  This  raises  the  question  of  whether  we  can  construct  a  larger  sample  with 
a  symmetric  first  stage.  I  attempted  to  construct  such  a  sample  by  estimating  a  Probit  first-stage  allowing 
interactions  with  covariates  and  then  selecting  the  sample  based  on  covariate-specific  fitted  values.'* 

The details of the symmetric sample selection are as follows. The idea is to use a parametric model
to capture the variation in the first-stage effect of same sex on childbearing with demographic covariates. The
model allows for a large set of interaction terms with covariates. I then look for covariate values where the
predicted  first-stage  effect  is  symmetric  in  the  sense  required  by  Restriction  3.  I  began  with  a  Probit  first- 


"As  with  an  over-identification  test,  the  power  of  the  test  turns  on  maintaining  the  validity  of  a  benchmark. 
Here, we maintain $E[Y_{1i} - Y_{0i} \mid X = \text{teen mothers}] = E[Y_{1i} - Y_{0i}]$.

'*A  maintained  assumption  here  is  that  the  distribution  of  yj,  and  r|j  is  jointly  symmetric  conditional  on  the 
covariates  used  to  select  the  sample  with  a  symmetric  first  stage. 



stage  equation: 

$P[D_i = 1 \mid Z_i, X_i] = \Phi[\kappa_0'X_i + (\kappa_1'X_i) Z_i]$,   (21)

where $X_i$ is a vector of covariates that includes age, age at first birth, Black and Hispanic dummies, and
dummies indicating women with some college and college graduates. The main effects, $\kappa_0'X_i$, and interaction
terms, $\kappa_1'X_i$, use the same parameterization of covariate effects (in particular, they both allow for linear terms
in the age variables plus main effects for the dummies). In practice, $\kappa_0'X_i$ takes on about 1,700 distinct
values. For each of these values, I calculated

$\pi_0(X_i) \equiv \Phi[\kappa_0'X_i]$,

the distribution of which is plotted in Fig. 3. This gives the distribution of the probability of childbearing
for women with different X-characteristics and $Z_i$ equal to zero. The distribution of $\pi_0(X_i)$ is concentrated
around the overall average of about .34, though there is considerable spread.

By definition, a symmetric first stage shifts the probability of treatment across the value of one-half.
To identify a sample where this is most likely, I initially selected women with $\pi_0(X_i)$ between .4 and .6.
Column 3 of Table 5 reports estimates for this sample, which has about 104,000 observations. The estimated
first stage in this sample shifts the probability of treatment from .47 to .54, i.e., approximately from $p$ to $1-p$,
as required by symmetry. For most outcomes, the IV estimates in this symmetric sample are smaller in
absolute value than in the full sample, and smaller than in the sample complementary to the symmetric
sample, for which results are reported in column 4. For example, the estimated effect on weeks worked in
the symmetric sample is -3.7 (s.e.=2.2), while the corresponding estimate in the complementary sample is
-6 (s.e.=1.6). Again, however, the comparison is handicapped by a lack of precision.
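
The selection rule can be sketched as follows, assuming the Probit main-effect coefficients have already been estimated. The coefficient values, the three-variable covariate layout, and the function names are all hypothetical and serve only to illustrate the mechanics; the standard normal CDF is computed from the error function so that the sketch needs nothing beyond the standard library.

```python
import math


def normal_cdf(x):
    """Standard normal CDF, Phi(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def baseline_probability(kappa0, x):
    """pi_0(X) = Phi(kappa_0'X): fitted probability of treatment when Z = 0."""
    index = sum(k * v for k, v in zip(kappa0, x))
    return normal_cdf(index)


def select_symmetric(rows, kappa0, lower=0.4, upper=0.6):
    """Keep covariate rows whose baseline probability lies in [lower, upper]."""
    return [x for x in rows if lower <= baseline_probability(kappa0, x) <= upper]


# Hypothetical coefficients on (constant, age at first birth, Black dummy).
kappa0 = [1.0, -0.07, 0.3]
rows = [
    (1.0, 16.0, 0.0),
    (1.0, 22.0, 0.0),
    (1.0, 30.0, 1.0),
]
kept = select_symmetric(rows, kappa0)
print(kept)
```

Passing `upper=1.0` together with a different `lower` turns the same function into a one-sided cutoff on the baseline probability.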

The long right tail of the distribution of first-stage base values plotted in Fig. 3 suggests that an even
larger symmetric sample can be constructed simply by dropping values of $\pi_0(X_i)$ beginning from the left and
working up. As it turns out, limiting the sample to individuals with values of $\pi_0(X_i)$ greater than or equal to
.35 leads to a first stage that shifts the probability of treatment from .465 to .533, virtually perfectly
symmetric. This can be seen in column 5 of Table 5, which reports first-stage and IV estimates for the
resulting sample of 162,264 observations. Most of the estimates in this symmetric sample are close to those
in the full sample. For example, the effect on weeks worked is -5.9 (s.e.=1.8) and the effect on divorce or
separation is .068 (s.e.=.032). Perhaps surprisingly, the estimated effect on poverty status differs markedly
between this sample and its complement (.136 versus .048), but the estimated effect is still significantly
different from zero in the complementary sample.
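
One rough way to gauge the contrast between a sample and its complement is a z-statistic for the difference between the two estimates. The sketch below assumes the two estimates are independent, which is plausible for disjoint samples but is an assumption made here rather than a calculation reported in the paper; the inputs are the rounded estimates and standard errors from columns 5 and 6 of Table 5.

```python
import math


def z_for_difference(est_a, se_a, est_b, se_b):
    """z-statistic for est_a - est_b, treating the two estimates as independent."""
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Poverty status: symmetric sample II versus its complement.
z_poverty = z_for_difference(0.136, 0.036, 0.048, 0.028)
# Weeks worked: symmetric sample II versus its complement.
z_weeks = z_for_difference(-5.90, 1.83, -4.71, 1.86)

print(f"poverty status: z = {z_poverty:.2f}")
print(f"weeks worked:   z = {z_weeks:.2f}")
```

Under this independence assumption the poverty contrast is borderline by conventional standards, while the weeks-worked contrast is well within sampling error.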

4.2.2  Imputation  of  ATE 

The results in Table 5 reflect an attempt to identify or construct samples where LATE=ATE.
Alternately, we can use equations (11) or (19) to impute a value of ATE for the various subsamples analyzed
in Table 3. The results of this effort are presented in Table 6 for four outcomes; this table also reports the
no-selection alternative used to construct the specification test discussed at the beginning of Section 3.2.
The estimates of the no-selection alternative are all slightly farther from the estimates of LATE than the
corresponding OLS estimates. For example, the OLS estimate of the effect on weeks worked in the full
sample is -7.34 (s.e.=.08), while the no-selection alternative is -7.56 (s.e.=.12). This suggests, as noted
earlier, that the contrast between IV and the no-selection alternative provides a more powerful specification
test than a conventional IV/OLS comparison.

Estimates of ATE constructed using equation (11) for the effect of childbearing on weeks worked
are similar to the estimates of LATE, even in samples where the first stage is not symmetric. For example,
the estimate of ATE for the sample of non-teen (i.e., adult) mothers is -4.1 (s.e.=1.5), in comparison with an
estimate of LATE of -5.3 (s.e.=1.6). Using the estimates of $\theta$ shown in Table 4 and equation (19) generates
somewhat smaller estimates for the effect on weeks worked other than in the teen mother sample, though
again mostly still significant.

Estimates of ATE for outcomes other than weeks worked are mostly insignificant. This contrasts
with the mostly significant estimates of LATE. Again, this is partly a problem of precision. But the
estimates of ATE outside the teen mother sample move substantially closer to zero than the estimates of
LATE. For example, while LATE suggests the probability of divorce or separation increases by .053
(s.e.=.019), the corresponding estimates of ATE are .028 (s.e.=.019) when $\theta$ equals one and .009 (s.e.=.021)
when $\theta$ is estimated. The evidence that further childbearing increases divorce or separation for the typical
woman with two children is therefore weaker than the estimates of LATE would suggest. Except for the
sample of teen mothers, the estimates of ATE for effects on poverty status are also smaller than the
corresponding estimates of LATE.

5.  Summary  and  Conclusions 

The framework outlined here provides a strategy for modeling treatment effect heterogeneity across
potential-assignment subpopulations. I focused initially on restrictions that make IV estimates of causal
effects on compliers representative of the overall population average treatment effect. The framework also
leads to procedures that can be used to impute average treatment effects from information on average
outcomes for compliers, always-takers, and never-takers. An illustration of these ideas using same sex
instruments suggests this approach may be useful in applied work.

On  the  empirical  side,  estimates  of  LATE  for  teen  mothers  are  close  to  the  corresponding  average 
treatment  effects  for  this  population,  when  the  latter  are  inferred  using  a  number  of  linearity  or 
proportionality  assumptions.  And  while  estimates  of  the  overall  average  effect  of  childbearing  are  smaller 
than  the  corresponding  IV  estimates,  most  of  the  estimated  effects  on  labor  supply  and  poverty  status  remain 
substantial  and  significant.  On  the  other  hand,  most  (though  not  all)  of  the  estimated  average  effects  on 
marital  status  and  welfare  use  are  small  and  insignificant. 

Estimates of the effects of childbearing on marital stability and welfare use using the same sex
instrument suggest the outline of a coherent picture, but many features remain unresolved. In this
application, the theory of parameter heterogeneity runs quickly into the sandpile of sampling variance and
specification uncertainty. On balance, I think extrapolation efforts of the sort implemented here are more
likely to weaken the case for the predictive value of a particular causal estimate than to provide a concrete
and precise alternative to traditional IV. For example, the evidence for an adverse effect of childbearing on
marital stability and welfare use is clearly weakened by the attempt to go from LATE to ATE. This sort of
destructive evidence seems to me to be a prominent feature of life in the empirical world. The external
validity of IV estimates is ultimately established less by new econometric methods than by replication in new
data sets, and, of course, by new instruments.

APPENDIX: COMPUTATION

1.  The  test  for  selection  bias. 

Drop individual subscripts from the notation. Consider the following two-equation system:

$Y = \Delta_0 + \Delta_1 D + \mu$   (A1)

$Y = \delta_0 + \delta_1 (1[D=Z]) + \delta_2 ((D-Z)/2) + \nu$   (A2)

The test for selection bias is a test of whether $\Delta_1 = \delta_2$ when (A1) is estimated by IV using Z as an instrument
and (A2) is estimated by OLS. The two coefficients and the asymptotic standard error for their difference can
be estimated by stacking (A1) and (A2) and allowing for heteroscedastic and correlated residuals. In practice,
for sample sizes on the order of that used here, it seems reasonable to treat the estimate of $\delta_2$ as
non-stochastic and use the standard error of the estimate of $\Delta_1$ to construct a t-test.
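
Without covariates, both pieces of this test reduce to differences in cell means, which the following sketch illustrates. It assumes a binary D and Z, so that the IV estimate of (A1) is the Wald ratio and the OLS coefficient on (D-Z)/2 in the saturated regression (A2) equals the mean of Y in the D=1, Z=0 cell minus the mean of Y in the D=0, Z=1 cell. The function names and the toy data are hypothetical, and the IV standard error is taken as given rather than computed.

```python
def mean(values):
    return sum(values) / len(values)


def wald_iv(rows):
    """IV (Wald) estimate of the effect of D on Y using binary Z; rows are (y, d, z)."""
    y1 = mean([y for y, d, z in rows if z == 1])
    y0 = mean([y for y, d, z in rows if z == 0])
    d1 = mean([d for y, d, z in rows if z == 1])
    d0 = mean([d for y, d, z in rows if z == 0])
    return (y1 - y0) / (d1 - d0)


def no_selection_alternative(rows):
    """Coefficient on (D-Z)/2 in (A2): E[Y | D=1, Z=0] - E[Y | D=0, Z=1]."""
    y_d1_z0 = mean([y for y, d, z in rows if d == 1 and z == 0])
    y_d0_z1 = mean([y for y, d, z in rows if d == 0 and z == 1])
    return y_d1_z0 - y_d0_z1


def selection_t_stat(rows, se_iv):
    """t-statistic treating the no-selection alternative as non-stochastic."""
    return (wald_iv(rows) - no_selection_alternative(rows)) / se_iv


# Toy data with a constant effect of 2 and no selection (illustrative only).
rows = (
    [(3.0, 1, 0)] * 2 + [(1.0, 0, 0)] * 4    # Z = 0: two always-takers treated
    + [(3.0, 1, 1)] * 4 + [(1.0, 0, 1)] * 2  # Z = 1: compliers treated as well
)
print(wald_iv(rows), no_selection_alternative(rows), selection_t_stat(rows, se_iv=0.5))
```

In this toy example the two estimates coincide, so the statistic is zero up to floating-point error; with real data the stacked-system standard error described above would replace the supplied `se_iv` when the shortcut is not appropriate.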

2.  Estimates  under  Restriction  3. 
Use (A2) to write:

$E[Y_1 \mid D=1, Z=0] = E[Y_1 \mid D_1 = D_0 = 1] = \delta_0 + \delta_2/2$

$E[Y_0 \mid D=0, Z=1] = E[Y_0 \mid D_1 = D_0 = 0] = \delta_0 - \delta_2/2$.

Estimates of $E[Y_1 \mid D_1 > D_0]$ and $E[Y_0 \mid D_1 > D_0]$ can be obtained as IV estimates of the coefficients $\Delta_{11}$ and
$\Delta_{01}$ in (A3) and (A4), below:

$DY = \Delta_{10} + \Delta_{11} D + \mu_1$   (A3)

$(1-D)Y = \Delta_{00} + \Delta_{01}(1-D) + \mu_0$.   (A4)

Estimates of ATE under Restriction 3 are a linear combination of $\delta_0$, $\delta_2$, $\Delta_{01}$, and $\Delta_{11}$. These coefficients and
the standard error for any linear combination of them can be estimated by stacking (A2), (A3), and (A4).
To further simplify, rewrite equation (11) in terms of the parameters in (A2)-(A4) as

$E[Y_1 - Y_0] = \Delta_{11}[1 - (p_a - p_n)] - \Delta_{01}[1 + (p_a - p_n)] + 2\delta_0(p_a - p_n)$.   (A5)

To accommodate models with covariates, it is convenient to use a regression set-up to estimate $p_a - p_n$. Define
a dependent variable $d^* = D(2Z-1) - Z$. Regress $d^*$ on Z; the coefficient on Z is an estimate of $p_a - p_n$. Note that
(without covariates) the standard error for the estimated $p_a - p_n$ is the same as the standard error for the
first-stage coefficient since the latter can be written $1 - p_a - p_n$. To estimate $E[Y_1 - Y_0 \mid D=1]$, replace $p_a - p_n$ with
$p_a/p_d$ in (A5).
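
The regression device for the always-taker/never-taker gap can be checked on a toy sample. The sketch below constructs the dependent variable D(2Z-1) - Z, regresses it on Z by bivariate OLS, and compares the slope with the gap implied by the first stage; the data and the helper name are hypothetical.

```python
def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with an intercept)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


# Toy data: P[D=1 | Z=0] = 0.3 and P[D=1 | Z=1] = 0.4,
# so p_a = 0.3, p_n = 0.6 and the first stage is 0.1.
z = [0] * 10 + [1] * 10
d = [1] * 3 + [0] * 7 + [1] * 4 + [0] * 6

# Constructed dependent variable: equals -D when Z = 0 and D - 1 when Z = 1.
d_star = [di * (2 * zi - 1) - zi for di, zi in zip(d, z)]

gap = ols_slope(z, d_star)    # estimate of p_a - p_n
first_stage = ols_slope(z, d)  # estimate of 1 - p_a - p_n

print(f"p_a - p_n = {gap:.3f}, first stage = {first_stage:.3f}")
```

With covariates, the same constructed variable would simply be regressed on Z and the covariates, which is the convenience the regression set-up is meant to provide.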

Models with covariates were estimated by adding covariates to the relevant first-stage equations, and
to equations (A1)-(A4). As a shortcut for inference for estimates of ATE using (A5), it seems reasonable to treat
$(p_a - p_n)$ and $\delta_0$ as known since these are estimated much more precisely than $\Delta_{11}$ and $\Delta_{01}$, which are
themselves instrumental variables estimates. Note also that IV estimates of $\Delta_{11}$ and $\Delta_{01}$ are independent.

3. Estimates under Restriction 4.

Substitute parameters from (A1)-(A4) into (19) and simplify to obtain

$E[Y_1 - Y_0] = \Delta_{11}[1 - (p_a - \theta^{-1}p_n)] - \Delta_{01}[1 + (\theta p_a - p_n)] + [\delta_0(1 + \theta^{-1}) + (\delta_2/2)(\theta^{-1} - 1)](\theta p_a - p_n)$.   (A6)

Standard errors were calculated treating $p_a$, $p_n$, $\delta_0$, $\delta_2$, and $\theta$ as known.
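
The two imputation formulas are simple enough to check numerically. The sketch below codes (A5) and (A6) as functions of the complier means of Y1 and Y0 (the IV coefficients in (A3) and (A4)), the OLS coefficients on the constant and on (D-Z)/2 in (A2), the always-taker and never-taker shares, and $\theta$. The argument names and all numerical inputs are hypothetical; none of them are estimates from the paper.

```python
def ate_restriction3(y1_compliers, y0_compliers, delta0, p_a, p_n):
    """ATE from (A5): uses complier means, delta_0 and the gap p_a - p_n."""
    gap = p_a - p_n
    return (y1_compliers * (1.0 - gap)
            - y0_compliers * (1.0 + gap)
            + 2.0 * delta0 * gap)


def ate_restriction4(y1_compliers, y0_compliers, delta0, delta2, p_a, p_n, theta):
    """ATE from (A6): generalizes (A5) with the proportionality parameter theta."""
    inv = 1.0 / theta
    scaled_gap = theta * p_a - p_n
    return (y1_compliers * (1.0 - (p_a - inv * p_n))
            - y0_compliers * (1.0 + scaled_gap)
            + (delta0 * (1.0 + inv) + (delta2 / 2.0) * (inv - 1.0)) * scaled_gap)


# Hypothetical inputs, chosen only to exercise the formulas.
args = dict(y1_compliers=20.0, y0_compliers=25.0, delta0=24.0, p_a=0.344, p_n=0.594)
late = args["y1_compliers"] - args["y0_compliers"]

ate3 = ate_restriction3(**args)
ate4_theta_one = ate_restriction4(delta2=-6.0, theta=1.0, **args)
ate3_symmetric = ate_restriction3(20.0, 25.0, 24.0, p_a=0.468, p_n=0.468)

print(late, ate3, ate4_theta_one, ate3_symmetric)
```

Two properties are worth noting: (A6) collapses to (A5) when $\theta$ equals one, and (A5) returns LATE whenever the always-taker and never-taker shares are equal, which is the symmetric first-stage case discussed in Section 4.2.1.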
Panel A: Moderate selection on gains

Fig. 1: The relationship between LATE, ATE and the effect on the treated (TT) for alternate first-stage
baseline values. The first-stage effect is fixed at .07 and ATE=0. The top panel calculation
sets the correlation between gains and the treatment index to -.1, while the bottom panel sets this
correlation to -.5.

Fig. 2: The relationship between LATE, ATE and the effect on the treated (TT) for alternate first-stage
baseline values. The first-stage effect is fixed at .30 and ATE=0. The top panel calculation
sets the correlation between gains and the treatment index to -.1, while the bottom panel sets this
correlation to -.5.

some college and college graduates. There are about 1,700 values in the histogram.


TABLE 1: DESCRIPTIVE STATISTICS, WOMEN AGED 21-35 WITH 2 OR MORE CHILDREN

Means and Standard Deviations

                                           All        Ever       No         Some          Teen       Adult
Variable                                   Women      Married    College    College or +  Mothers    Mothers

Children ever born                         2.50       2.49       2.55       2.41          2.72       2.41
                                           (0.76)     (0.75)     (0.81)     (0.68)        (0.90)     (0.68)
More than 2 children (=1 if mother         0.375      0.370      0.405      0.328         0.500      0.324
  had more than 2 children)                (0.484)    (0.483)    (0.491)    (0.470)       (0.500)    (0.468)
Boy 1st (=1 if 1st child was a boy)        0.512      0.512      0.510      0.514         0.509      0.513
                                           (0.500)    (0.500)    (0.500)    (0.500)       (0.500)    (0.500)
Boy 2nd (=1 if 2nd child was a boy)        0.511      0.511      0.509      0.512         0.508      0.511
                                           (0.500)    (0.500)    (0.500)    (0.500)       (0.500)    (0.500)
Two boys (=1 if first two children         0.264      0.264      0.262      0.266         0.262      0.264
  were boys)                               (0.441)    (0.441)    (0.440)    (0.442)       (0.440)    (0.441)
Two girls (=1 if first two children        0.241      0.241      0.242      0.240         0.245      0.240
  were girls)                              (0.428)    (0.427)    (0.428)    (0.427)       (0.430)    (0.427)
Same sex (=1 if first two children         0.505      0.505      0.504      0.506         0.507      0.504
  were the same sex)                       (0.500)    (0.500)    (0.500)    (0.500)       (0.500)    (0.500)
Age                                        30.4       30.6       29.92      31.3          28.8       31.1
                                           (3.5)      (3.4)      (3.60)     (3.0)         (3.89)     (3.0)
Age at first birth (mother's age           21.8       22.0       20.85      23.4          17.9       23.5
  when first child was born)               (3.5)      (3.5)      (3.16)     (3.50)        (1.13)     (2.8)
Black Mother                               0.131      0.092      0.141      0.115         0.237      0.088
                                           (0.337)    (0.289)    (0.348)    (0.319)       (0.425)    (0.283)
Hispanic Mother                            0.113      0.112      0.144      0.065         0.156      0.096
                                           (0.317)    (0.315)    (0.351)    (0.246)       (0.363)    (0.294)
Never Married                              0.068      -          0.089      0.036         0.144      0.037
                                           (0.252)    -          (0.285)    (0.186)       (0.351)    (0.190)
Married Now                                0.798      0.857      0.768      0.846         0.649      0.859
                                           (0.401)    (0.350)    (0.422)    (0.361)       (0.477)    (0.348)
Divorced                                   0.081      0.087      0.083      0.079         0.122      0.065
                                           (0.273)    (0.282)    (0.276)    (0.270)       (0.327)    (0.246)
Divorced or Separated                      0.127      0.137      0.136      0.114         0.197      0.099
                                           (0.333)    (0.344)    (0.343)    (0.317)       (0.398)    (0.299)
High School Graduate (=1 if high school    0.420      0.423      0.685      -             0.426      0.417
  diploma and no further education)        (0.494)    (0.494)    (0.465)    -             (0.495)    (0.493)
Some College (=1 if some college, but      0.264      0.269      -          0.682         0.174      0.301
  no degree)                               (0.441)    (0.444)    -          (0.466)       (0.379)    (0.458)
College Graduate (=1 if bachelor's degree  0.123      0.131      -          0.318         0.018      0.166
  or higher)                               (0.329)    (0.338)    -          (0.466)       (0.131)    (0.372)
In Poverty (=1 if family income            0.197      0.158      0.256      0.103         0.347      0.136
  below the poverty line)                  (0.398)    (0.364)    (0.437)    (0.303)       (0.476)    (0.343)
Welfare Recipient (=1 if public            0.098      0.065      0.130      0.047         0.187      0.062
  assistance income>0)                     (0.297)    (0.247)    (0.336)    (0.212)       (0.390)    (0.241)
Worked for pay (=1 if worked for           0.662      0.674      0.623      0.723         0.650      0.667
  pay in 1989)                             (0.473)    (0.469)    (0.485)    (0.447)       (0.477)    (0.471)
Weeks worked (weeks worked                 26.2       26.9       24.27      29.3          25.0       26.7
  in 1989)                                 (22.9)     (22.9)     (23.00)    (22.4)        (22.81)    (22.9)

Number of observations                     380007     357063     236418     143589        110156     269851

Notes: Data are from the 1990 PUMS. The sample includes women with 2 or more children whose 2nd child was at least age 1 and who had their first birth
at age 15 or later. The no-college sample includes women with no college or with an associate occupational degree. The some-college sample includes
women with an associate academic degree, some college but no degree, or a college degree. Teen mothers are those who had their first birth at age 19 or
younger. Standard deviations are reported in parentheses. All calculations use sample weights.


TABLE 2: FIRST-STAGE AND LABOR SUPPLY ESTIMATES

                        All Women         Ever Married      No College        Some College or + Teen Mothers      Adult Mothers
Variable                OLS      IV       OLS      IV       OLS      IV       OLS      IV       OLS      IV       OLS      IV

A. No Covariates

First Stage
More than 2 children    0.0628            0.0663            0.0652            0.0594            0.0638            0.0616
                        (0.002)           (0.002)           (0.002)           (0.003)           (0.003)           (0.002)

Outcomes
Worked for pay          -0.132   -0.084   -0.126   -0.083   -0.132   -0.105   -0.112   -0.057   -0.140   -0.026   -0.130   -0.109
                        (0.002)  (0.027)  (0.002)  (0.026)  (0.002)  (0.034)  (0.003)  (0.044)  (0.003)  (0.051)  (0.002)  (0.032)
Weeks worked            -7.34    -5.15    -7.12    -5.09    -7.22    -6.52    -6.59    -3.21    -7.47    -4.76    -7.19    -5.26
                        (0.08)   (1.30)   (0.09)   (1.27)   (0.11)   (1.60)   (0.14)   (2.18)   (0.15)   (2.40)   (0.10)   (1.57)

B. With Covariates

First Stage
More than 2 children    0.0523            0.0658            0.0644            0.0592            0.0633            0.0623
                        (0.0017)          (0.0017)          (0.0022)          (0.0027)          (0.0033)          (0.0019)

Outcomes
Worked for pay          -0.148   -0.097   -0.148   -0.097   -0.144   -0.120   -0.151   -0.060   -0.132   -0.071   -0.154   -0.108
                        (0.002)  (0.027)  (0.002)  (0.026)  (0.002)  (0.034)  (0.003)  (0.043)  (0.003)  (0.049)  (0.002)  (0.031)
Weeks worked            -8.33    -5.93    -8.43    -5.92    -7.92    -7.43    -8.84    -3.45    -7.51    -7.55    -8.62    -5.30
                        (0.09)   (1.27)   (0.09)   (1.24)   (0.11)   (1.57)   (0.14)   (2.15)   (0.15)   (2.28)   (0.10)   (1.52)

Notes: The table reports OLS and IV estimates of the coefficient on the More than 2 children variable. The IV estimates use Same sex as an instrument.
The first-stage rows report one estimate per sample. The covariates included in panel B are Age, Age at first birth, and dummies for Boy 1st, Boy 2nd,
Black, Hispanic, Other race, High school graduate, Some college and College graduate. The samples are the same as in Table 1. Standard errors are
reported in parentheses. All calculations use sample weights.


TABLE 3: EFFECTS ON MARITAL STATUS, POVERTY STATUS AND WELFARE USE

                        All Women         Ever Married      No College        Some College or + Teen Mothers      Adult Mothers
Outcomes                OLS      IV       OLS      IV       OLS      IV       OLS      IV       OLS      IV       OLS      IV

A. No Covariates

Ever Married            -0.021   -0.010   -        -        -0.026   -0.023   -0.002   0.007    -0.016   -0.010   0.0004   -0.003
                        (0.001)  (0.153)  -        -        (0.001)  (0.021)  (0.001)  (0.020)  (0.002)  (0.0391) (0.0009) (0.014)
Married Now             -0.025   -0.062   -0.0075  -0.055   -0.035   -0.082   0.008    -0.035   -0.015   -0.066   0.018    -0.048
                        (0.002)  (0.024)  (0.001)  (0.020)  (0.002)  (0.030)  (0.002)  (0.036)  (0.003)  (0.051)  (0.002)  (0.025)
Divorced                -0.013   0.011    -0.012   0.012    -0.012   0.0044   -0.017   0.022    -0.024   -0.010   -0.022   0.016
                        (0.001)  (0.016)  (0.001)  (0.016)  (0.001)  (0.0195) (0.002)  (0.027)  (0.002)  (0.035)  (0.001)  (0.017)
Divorced or Separated   0.0023   0.053    0.0056   0.055    0.0070   0.057    -0.011   0.046    -0.0030  0.048    -0.018   0.049
                        (0.0013) (0.019)  (0.0014) (0.020)  (0.0016) (0.024)  (0.002)  (0.032)  (0.0027) (0.043)  (0.001)  (0.021)
In Poverty              0.143    0.095    0.124    0.082    0.167    0.107    0.070    0.088    0.178    0.143    0.083    0.062
                        (0.002)  (0.029)  (0.002)  (0.020)  (0.002)  (0.031)  (0.002)  (0.030)  (0.003)  (0.050)  (0.002)  (0.024)
Welfare Recipient       0.067    0.033    0.050    0.032    0.079    0.028    0.030    0.049    0.091    0.018    0.030    0.032
                        (0.001)  (0.018)  (0.001)  (0.014)  (0.002)  (0.024)  (0.002)  (0.022)  (0.003)  (0.042)  (0.001)  (0.017)

B. With Covariates

Ever Married            0.0026   -0.0051  -        -        0.0038   -0.025   0.0061   0.023    0.0087   -0.031   0.0043   0.0025
                        (0.0009) (0.0136) -        -        (0.0013) (0.019)  (0.0012) (0.018)  (0.0022) (0.034)  (0.0009) (0.0126)
Married Now             0.037    -0.052   0.037    -0.046   0.028    -0.080   0.057    -0.011   0.020    -0.078   0.047    -0.041
                        (0.001)  (0.021)  (0.001)  (0.019)  (0.002)  (0.028)  (0.002)  (0.033)  (0.003)  (0.047)  (0.002)  (0.023)
Divorced                -0.037   0.0071   -0.039   0.0067   -0.029   0.0010   -0.048   0.018    -0.026   -0.021   -0.040   0.017
                        (0.001)  (0.0157) (0.001)  (0.0159) (0.001)  (0.0196) (0.002)  (0.026)  (0.002)  (0.035)  (0.001)  (0.017)
Divorced or Separated   -0.033   0.048    -0.036   0.047    -0.023   0.054    -0.049   0.038    -0.013   0.040    -0.041   0.048
                        (0.001)  (0.019)  (0.001)  (0.019)  (0.002)  (0.025)  (0.002)  (0.031)  (0.003)  (0.043)  (0.001)  (0.020)
In Poverty              0.093    0.097    0.087    0.085    0.113    0.113    0.055    0.074    0.148    0.186    0.065    0.061
                        (0.001)  (0.021)  (0.001)  (0.019)  (0.002)  (0.028)  (0.002)  (0.029)  (0.003)  (0.047)  (0.002)  (0.022)
Welfare Recipient       0.039    0.033    0.031    0.032    0.048    0.032    0.020    0.041    0.071    0.043    0.022    0.030
                        (0.001)  (0.017)  (0.001)  (0.014)  (0.002)  (0.023)  (0.001)  (0.021)  (0.003)  (0.040)  (0.001)  (0.016)

Notes: The samples and models are as in Table 2, with different dependent variables. Standard errors are reported in parentheses. All calculations use sample weights.


TABLE 4: POTENTIAL-ASSIGNMENT SUBPOPULATIONS

                No Covariates                                                     With Covariates
Sample          P[D=1]     p_c        p_a        p_n        p_a-p_n    θ          p_c        p_a-p_n
                (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)

All Women       0.375      0.063      0.344      0.594      -0.250     0.619      0.062      -0.250
                           (0.0018)                         (0.0018)              (0.0017)   (0.0018)
Ever Married    0.370      0.066      0.337      0.597      -0.261     0.607      0.066      -0.260
                           (0.0018)                         (0.0018)              (0.0017)   (0.0018)
No College      0.405      0.065      0.372      0.563      -0.191     0.696      0.064      -0.191
                           (0.0023)                         (0.0023)              (0.0022)   (0.0023)
Some College    0.328      0.059      0.298      0.642      -0.344     0.510      0.059      -0.344
                           (0.0027)                         (0.0027)              (0.0027)   (0.0027)
Teen Mothers    0.500      0.064      0.468      0.468      -0.0006    0.999      0.063      -0.0005
                           (0.0034)                         (0.0034)              (0.0033)   (0.0034)
Adult Mothers   0.324      0.062      0.293      0.645      -0.352     0.502      0.062      -0.351
                           (0.0020)                         (0.0020)              (0.0019)   (0.0020)

Notes: The first column reports the proportion treated. The second column shows the proportion of compliers in the sample,
which is given by the first-stage effect of Same sex. The estimates of the proportion of always-takers and never-takers and
the parameter θ were calculated as described in the text. Estimates with covariates were calculated as described in the
appendix. Standard errors are reported in parentheses. All calculations use sample weights.


TABLE 5: SYMMETRIC FIRST STAGE SAMPLES

Columns: (1) All Women; (2) Teen Mothers; (3) Symmetric Sample I, π0(X)>=0.4 & π0(X)<=0.6; (4) π0(X)<0.4 or π0(X)>0.6;
(5) Symmetric Sample II, π0(X)>=0.35; (6) π0(X)<0.35.

Variables               (1)        (2)        (3)        (4)        (5)        (6)

First Stage (OLS estimates)
Coefficient             0.0628     0.0638     0.0713     0.0588     0.0684     0.0576
                        (0.002)    (0.003)    (0.003)    (0.002)    (0.003)    (0.002)
Constant                0.344      0.468      0.471      0.296      0.465      0.253
                        (0.001)    (0.002)    (0.002)    (0.001)    (0.002)    (0.001)

Outcomes (IV estimates)
Worked for pay          -0.084     -0.026     -0.038     -0.109     -0.080     -0.092
                        (0.027)    (0.051)    (0.045)    (0.034)    (0.038)    (0.039)
Weeks worked            -5.15      -4.76      -3.72      -6.03      -5.90      -4.71
                        (1.30)     (2.40)     (2.21)     (1.62)     (1.83)     (1.86)
Ever Married            -0.010     -0.0098    -0.016     -0.0031    -0.0051    -0.0092
                        (0.015)    (0.0391)   (0.032)    (0.0170)   (0.0267)   (0.0162)
Married Now             -0.062     -0.066     -0.033     -0.066     -0.075     -0.039
                        (0.024)    (0.051)    (0.045)    (0.027)    (0.038)    (0.028)
Divorced                0.011      -0.010     -0.029     0.025      0.012      0.0057
                        (0.016)    (0.035)    (0.031)    (0.018)    (0.026)    (0.0189)
Divorced or Separated   0.053      0.048      0.020      0.063      0.068      0.033
                        (0.019)    (0.043)    (0.038)    (0.022)    (0.0032)   (0.023)
In Poverty              0.095      0.143      0.095      0.087      0.136      0.048
                        (0.023)    (0.050)    (0.044)    (0.027)    (0.036)    (0.028)
Welfare Recipient       0.033      0.018      0.021      0.034      0.027      0.032
                        (0.018)    (0.042)    (0.035)    (0.020)    (0.029)    (0.020)

Number of Observations  380007     110156     103803     276204     162264     217743

Notes: Columns 1 and 2 repeat estimates from Tables 2 and 3, for models without covariates. Columns 3-6 report estimates using samples
selected as described in the text. Standard errors are shown in parentheses. All calculations use sample weights.


TABLE 6: IMPUTATION OF ATE

                                          OLS                 No Selection        LATE                ATE                 ATE
Outcome                Sample                                 Alternative                             θ = 1               θ = P[D=1|Z=1]/(1-P[D=1|Z=0])
                                          (1)                 (2)                 (3)                 (4)                 (5)

Weeks Worked           All Women          -7.34 (0.08)        -7.56 (0.12)        -5.15 (1.30)        -4.31 (1.27)        -3.19 (1.45)
                       Ever Married       -7.12 (0.09)        -7.33 (0.13)        -5.09 (1.27)        -4.41 (1.23)        -3.45 (1.43)
                       No College         -7.22 (0.11)        -7.33 (0.16)        -6.52 (1.60)        -5.73 (1.57)        -4.94 (1.70)
                       Some College or +  -6.59 (0.14)        -6.90 (0.21)        -3.21 (2.18)        -2.38 (2.08)        -0.60 (2.69)
                       Teen Mothers       -7.47 (0.15)        -7.66 (0.22)        -4.76 (2.40)        -4.76 (2.40)        -4.76 (2.40)
                       Adult Mothers      -7.19 (0.10)        -7.43 (0.15)        -5.26 (1.57)        -4.06 (1.48)        -2.19 (1.91)

Divorced or Separated  All Women          0.0023 (0.0013)     0.0005 (0.0019)     0.053 (0.019)       0.028 (0.019)       0.0092 (0.0216)
                       Ever Married       0.0056 (0.0014)     0.0043 (0.0020)     0.055 (0.020)       0.024 (0.019)       -0.0002 (0.0221)
                       No College         0.0070 (0.0016)     0.0053 (0.0024)     0.057 (0.024)       0.032 (0.024)       0.011 (0.026)
                       Some College or +  -0.011 (0.002)      -0.014 (0.003)      0.046 (0.032)       0.034 (0.029)       0.034 (0.037)
                       Teen Mothers       -0.0030 (0.0027)    -0.0063 (0.0040)    0.048 (0.043)       0.048 (0.043)       0.048 (0.043)
                       Adult Mothers      -0.018 (0.001)      -0.021 (0.002)      0.049 (0.021)       0.017 (0.019)       -0.0024 (0.024)

In Poverty             All Women          0.143 (0.002)       0.150 (0.002)       0.095 (0.023)       0.049 (0.024)       -0.0023 (0.029)
                       Ever Married       0.124 (0.002)       0.129 (0.002)       0.082 (0.020)       0.054 (0.021)       0.020 (0.026)
                       No College         0.167 (0.002)       0.175 (0.003)       0.107 (0.031)       0.061 (0.032)       0.014 (0.036)
                       Some College or +  0.070 (0.002)       0.071 (0.003)       0.088 (0.030)       0.059 (0.031)       0.031 (0.042)
                       Teen Mothers       0.178 (0.003)       0.181 (0.005)       0.143 (0.050)       0.143 (0.051)       0.143 (0.051)
                       Adult Mothers      0.083 (0.002)       0.088 (0.003)       0.062 (0.024)       0.017 (0.025)       -0.039 (0.033)

Welfare Recipient      All Women          0.067 (0.001)       0.072 (0.002)       0.033 (0.018)       0.0058 (0.0185)     -0.026 (0.022)
                       Ever Married       0.050 (0.001)       0.052 (0.002)       0.032 (0.014)       0.015 (0.015)       -0.0051 (0.0182)
                       No College         0.079 (0.002)       0.085 (0.003)       0.028 (0.024)       -0.001 (0.025)      -0.031 (0.029)
                       Some College or +  0.030 (0.002)       0.030 (0.002)       0.049 (0.022)       0.037 (0.022)       0.029 (0.030)
                       Teen Mothers       0.091 (0.003)       0.096 (0.004)       0.018 (0.042)       0.018 (0.043)       0.018 (0.043)
                       Adult Mothers      0.030 (0.001)       0.032 (0.002)       0.032 (0.017)       0.006 (0.017)       -0.024 (0.023)

Notes: Standard errors are shown in parentheses. Columns 1 and 3 repeat estimates from Tables 2 and 3. Column 2 shows the no-selection alternative under Restriction 1 and
for the selection-bias test. Column 4 reports estimates of ATE under Restriction 3 and column 5 reports estimates of ATE under
Restriction 4.